ruby: add clangarm64 target #13115

Closed
dennisameling wants to merge 1 commit

dennisameling
Contributor

This enables the clangarm64 target. The build currently fails with "unexpected ucrtbase.dll". I saw this related issue and would appreciate some pointers on how to fix it in a way similar to what was done in this PR. Happy to provide an upstream PR once we have this working. CC @jeremyd2019 🙏🏼

configure: ruby library version = 3.1.0
configure: creating ./config.status
config.status: creating GNUmakefile
config.status: creating Makefile
---
Configuration summary for ruby version 3.1.2

   * Installation prefix: /clangarm64
   * exec prefix:         ${prefix}
   * arch:                aarch64-mingw-ucrt
   * site arch:           aarch64-ucrt
   * RUBY_BASE_NAME:      ruby
   * enable shared:       yes
   * ruby lib prefix:     ${libdir}/${RUBY_BASE_NAME}
   * site libraries path: ${rubylibprefix}/${sitearch}
   * vendor path:         ${rubylibprefix}/vendor_ruby
   * target OS:           mingw32
   * compiler:            clang
   * with pthread:        no
   * with coroutine:      arm64
   * enable shared libs:  yes
   * dynamic library ext: so
   * CFLAGS:              -fdeclspec ${optflags} ${debugflags} ${warnflags}
   * LDFLAGS:             -L. -pipe
   * DLDFLAGS:            -pipe \
                          -Wl,--enable-auto-image-base,--enable-auto-import
   * optflags:            -O3 -fno-omit-frame-pointer -fno-fast-math
   * debugflags:          -ggdb3
   * warnflags:           -Wall -Wextra -Wdeprecated-declarations \
                          -Wdivision-by-zero \
                          -Wimplicit-function-declaration -Wimplicit-int \
                          -Wmisleading-indentation -Wpointer-arith \
                          -Wshorten-64-to-32 -Wwrite-strings \
                          -Wold-style-definition -Wmissing-noreturn \
                          -Wno-cast-function-type \
                          -Wno-constant-logical-operand -Wno-long-long \
                          -Wno-missing-field-initializers \
                          -Wno-overlength-strings -Wno-parentheses-equality \
                          -Wno-self-assign -Wno-tautological-compare \
                          -Wno-unused-parameter -Wno-unused-value \
                          -Wunused-variable -Wextra-tokens -Wundef
   * strip command:       llvm-strip -S -x
   * install doc:         rdoc
   * JIT support:         yes
   * man page type:       doc

---

The error:

linking miniruby.exe
generating encdb.h
unexpected ucrtbase.dll
unexpected ucrtbase.dll
unexpected ucrtbase.dll
make: *** [uncommon.mk:1178: builtin_binary.inc] Error 1
make: *** Waiting for unfinished jobs....
make: *** [uncommon.mk:841: .rbconfig.time] Error 1
make: *** [uncommon.mk:1129: encdb.h] Error 1
==> ERROR: A failure occurred in build().
    Aborting...

Comment on lines +21 to +26
+ frame.AddrPC.Mode = AddrModeFlat;
+ frame.AddrPC.Offset = context.Pc;
+ frame.AddrFrame.Mode = AddrModeFlat;
+ frame.AddrFrame.Offset = context.Fp;
+ frame.AddrStack.Mode = AddrModeFlat;
+ frame.AddrStack.Offset = context.Sp;
Contributor Author

@jeremyd2019
Member

That code looks at the machine code for _isatty to try to find the address of the non-exported __pioinfo pointer. ARM/ARM64 assembly isn't something I'm very familiar with, but I recently read some articles about how things work there, and it seems that the way globals are accessed there differs from both x86 and x86-64.

These are probably the instructions for arm64:

00007ffd`c6b2072c 480e00f0 adrp    x8, ucrtbase!type_info `RTTI Type Descriptor'+0xf8 (7ffdc6ceb000)
00007ffd`c6b20730 08013791 add     x8, x8, #0xDC0
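
As a rough illustration of how that adrp word maps to the page address WinDBG prints, here is a minimal C sketch. WinDBG lists the bytes in memory order (48 0e 00 f0), so the little-endian instruction word is 0xf0000e48. The helper names load_le32, is_adrp and adrp_target are made up for this example and are not part of ruby's win32.c; the bit layout used is the A64 ADRP encoding (immlo in bits 30..29, immhi in bits 23..5).

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit instruction word from four bytes in memory order
 * (AArch64 instructions are stored little-endian). */
static uint32_t load_le32(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* ADRP has bit 31 set and bits 28..24 equal to 0b10000. */
static int is_adrp(uint32_t insn)
{
    return (insn & 0x9F000000u) == 0x90000000u;
}

/* Page address produced by an ADRP instruction located at address pc. */
static uint64_t adrp_target(uint32_t insn, uint64_t pc)
{
    uint64_t immlo = (insn >> 29) & 0x3u;      /* bits 30..29 */
    uint64_t immhi = (insn >> 5) & 0x7FFFFu;   /* bits 23..5  */
    int64_t imm = (int64_t)((immhi << 2) | immlo);

    assert(is_adrp(insn));
    if (imm & (1 << 20))                       /* sign-extend the 21-bit value */
        imm -= (1 << 21);
    /* ADRP works in 4 KiB pages relative to the page containing pc. */
    return (pc & ~(uint64_t)0xFFF) + (uint64_t)(imm * 4096);
}
```

With the word and address from the listing above, adrp_target(0xf0000e48, 0x7ffdc6b2072c) gives 0x7ffdc6ceb000, the page shown in parentheses; the following add then supplies the low 12 bits (0xDC0).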

@dennisameling
Contributor Author

What tool do you use to get that info? Happy to take a look but I'm really new to this stuff. Am I right in assuming this is mostly a matter of "finding at what address this code is defined" so the compiler can find it?

@dennisameling
Contributor Author

Closing this as I was able to work around the dependency on Ruby in my project. Unfortunately, I don't currently have the bandwidth to dive deeper into this.

@jeremyd2019
Member

Cool. This code is really gross, but upstream ruby seemed insistent that they really needed that private pointer for something. I used WinDBG to disassemble ucrtbase!_isatty, then combined what I learned from that ARM64 thread on oldnewthing with the patterns I remember from x86 and x64 (from when I was fixing Windows 11 32-bit) to make an educated guess about which instruction is interesting.

@Biswa96
Member

Biswa96 commented Sep 25, 2022

The assembly looks like this: adrl x8, __pioinfo. Do we need the value of x8 in that state?

@jeremyd2019
Member

Yes

@Biswa96
Member

Biswa96 commented Oct 1, 2022

Can anyone compile the aarch64 ruby package with the following patch? Just for fun.

--- a/win32/win32.c
+++ b/win32/win32.c
@@ -2622,15 +2622,13 @@
             }
         }
     }
-    fprintf(stderr, "unexpected " UCRTBASE "\n");
-    _exit(1);
 
     found:
     p += sizeof(PIOINFO_MARK) - 1;
 #ifdef _WIN64
     rel = *(int32_t*)(p);
     rip = p + sizeof(int32_t);
-    __pioinfo = (ioinfo**)(rip + rel);
+    __pioinfo = 0xDEADBEEF;
 #else
     __pioinfo = *(ioinfo***)(p);
 #endif

@dennisameling
Contributor Author

dennisameling commented Oct 1, 2022

@Biswa96 that results in

../ruby-3.1.2/win32/win32.c:2631:15: error: incompatible integer to pointer conversion assigning to 'ioinfo **' from 'unsigned int' [-Wint-conversion]
    __pioinfo = 0xDEADBEEF;
              ^ ~~~~~~~~~~

When I change it to __pioinfo = (ioinfo**)0xDEADBEEF; instead, the file compiles, but then I'm getting:

../ruby-3.1.2/revision.h unchanged
linking miniruby.exe
generating encdb.h
make: *** [uncommon.mk:841: .rbconfig.time] Segmentation fault
make: *** Waiting for unfinished jobs....
make: *** [uncommon.mk:1129: encdb.h] Segmentation fault
make: *** [uncommon.mk:1178: builtin_binary.inc] Segmentation fault
==> ERROR: A failure occurred in build().
    Aborting...

Here are more logs from the build: build.txt

Happy to test further and/or to give you access to an ARM64 VM for further testing.

@Biswa96
Member

Biswa96 commented Oct 1, 2022

Oops! I forgot that ruby compiles itself first to create the miniruby executable. My previous comment was just for having fun with ruby, nothing serious. Here was the plan 😈

  1. I can not find any "marker" bytes like PIOINFO_MARK in _isatty function in AArch64 ucrtbase.dll. So, I went with objdump.
  2. Here is the important part in _isatty function from ucrtbase.dll in AArch64 WinPE.
18001072c: f0000e48     adrp    x8, #0x1801db000
180010730: 91370108     add     x8, x8, #0xdc0
  • The offset of __pioinfo would be 0x1801db000 + 0xdc0 - 0x180000000 = 0x1dbdc0
  • The actual memory address of __pioinfo will be GetModuleHandle("ucrtbase.dll") + 0x1dbdc0
  3. I have applied that idea with this simple code: https://github.com/Biswa96/Junkyard/blob/master/c/PokeIoInfo.c. It works on both x86_64 and AArch64.
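
The arithmetic in the bullets above can be written down as a small C sketch. The function names are hypothetical and this is not the code from PokeIoInfo.c; 0x180000000 is the image base implied by the objdump addresses, and on Windows the module base would come from GetModuleHandle("ucrtbase.dll") as described above.

```c
#include <assert.h>
#include <stdint.h>

/* RVA = (page from adrp + low 12 bits from add) - image base used by objdump. */
static uint64_t rva_from_adrp_add(uint64_t adrp_page, uint32_t add_imm,
                                  uint64_t image_base)
{
    assert((adrp_page & 0xFFF) == 0);  /* adrp always yields a 4 KiB page   */
    assert(add_imm < 0x1000);          /* add carries a 12-bit immediate    */
    return adrp_page + add_imm - image_base;
}

/* Runtime address = where the DLL is actually loaded + RVA. */
static uintptr_t runtime_address(uintptr_t module_base, uint64_t rva)
{
    return module_base + (uintptr_t)rva;
}
```

For the WinPE example, rva_from_adrp_add(0x1801db000, 0xdc0, 0x180000000) yields 0x1dbdc0. The RVA only holds for the exact ucrtbase.dll it was read from, so it has to be recomputed for other builds.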

@dennisameling
Contributor Author

dennisameling commented Oct 1, 2022

@Biswa96 I tried cc -DIOINFO_RVA=0x1dbdc0 -o PokeIoInfo.exe PokeIoInfo.c and running it, but it results in a Segmentation fault. LLDB returns the following:

(lldb) run
Process 7164 launched: 'C:\msys64\usr\src\MINGW-packages\mingw-w64-ruby\PokeIoInfo.exe' (aarch64)
Process 7164 stopped
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0x7ff76aa61588: Access violation reading location 0xc00051000254a8
    frame #0: 0x00007ff76aa61588 PokeIoInfo.exe`main + 136
PokeIoInfo.exe`main:
->  0x7ff76aa61588 <+136>: ldr    x8, [x8, #0x28]
    0x7ff76aa6158c <+140>: str    x8, [sp, #0x20]
    0x7ff76aa61590 <+144>: adrp   x8, 2
    0x7ff76aa61594 <+148>: ldr    x8, [x8, #0x848]

Am I missing something? I'm completely new to this stuff but eager to learn a bit more haha. For reference, I'm on Windows 11 22623.730.

UPDATE: I could reproduce this error on a fresh Azure ARM64 VM with 22621.521. Let me know if you want access to that for further debugging.

@Biswa96
Member

Biswa96 commented Oct 1, 2022

That offset value (0x1dbdc0) is not constant across Windows builds. I provided it as an example from a WinPE system. It has to be calculated as I have shown.

@dennisameling
Contributor Author

Gotcha, thanks. I did objdump --disassemble-symbols=_isatty /c/windows/system32/ucrtbase.dll and got the following output:

00000001800132e0 <_isatty>:
1800132e0: d503237f     pacibsp
1800132e4: a9bf7bfd     stp     x29, x30, [sp, #-16]!
1800132e8: 910003fd     mov     x29, sp
1800132ec: 3100081f     cmn     w0, #2
1800132f0: 54000260     b.eq    0x18001333c <_isatty+0x5c>
1800132f4: 37f88a80     tbnz    w0, #31, 0x180014444 <_Gettnames+0x5a4>
1800132f8: b0000e28     adrp    x8, 0x1801d8000 <_fflush_nolock+0x4fc>
1800132fc: b94d7d08     ldr     w8, [x8, #3452]
180013300: 6b08001f     cmp     w0, w8
180013304: 541e9442     b.hs    0x18005058c <log2f+0x199c>
180013308: 93407c09     sxtw    x9, w0
18001330c: b0000e28     adrp    x8, 0x1801d8000 <_fflush_nolock+0x510>
180013310: 9136c108     add     x8, x8, #3504
180013314: d346fd2a     lsr     x10, x9, #6
180013318: 92401529     and     x9, x9, #0x3f
18001331c: f86a7908     ldr     x8, [x8, x10, lsl #3]
180013320: d280090b     mov     x11, #72
180013324: 9b0b2129     madd    x9, x9, x11, x8
180013328: 3940e12a     ldrb    w10, [x9, #56]
18001332c: 121a0140     and     w0, w10, #0x40
180013330: a8c17bfd     ldp     x29, x30, [sp], #16
...

So that'd get us to 0x1801d8000 + 3504 - 0x180000000 = 0x1db504, right? I did cc -DIOINFO_RVA=0x1db504 -o PokeIoInfo.exe PokeIoInfo.c but am still running into issues with the resulting binary:

(lldb) run
Process 19272 launched: 'C:\msys64\usr\src\MINGW-packages\mingw-w64-ruby\PokeIoInfo.exe' (aarch64)
Process 19272 stopped
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0x7ff6cc8f1588: Access violation reading location 0x1339001410061
    frame #0: 0x00007ff6cc8f1588 PokeIoInfo.exe`main + 136
PokeIoInfo.exe`main:
->  0x7ff6cc8f1588 <+136>: ldr    x8, [x8, #0x28]
    0x7ff6cc8f158c <+140>: str    x8, [sp, #0x20]
    0x7ff6cc8f1590 <+144>: adrp   x8, 2
    0x7ff6cc8f1594 <+148>: ldr    x8, [x8, #0x848]

Here's the full output of objdump -d /c/windows/system32/ucrtbase.dll:
objdump.txt

@Biswa96
Member

Biswa96 commented Oct 1, 2022

#3504 is not in hex.

@dennisameling
Contributor Author

dennisameling commented Oct 1, 2022

Doh! 🤦🏼‍♂️ My head is in weekend mode haha. 0x1801d8000 + 0xdb0 - 0x180000000 = 0x1d8db0

Here we go:

$ ./PokeIoInfo.exe
matched (0): 1184 1184
matched (1): 1192 1192
matched (2): 1192 1192
matched (3): -1 -1
matched (4): -1 -1
...
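
The decimal/hex mix-up is easy to avoid by parsing the immediate rather than pasting its digits into a hex sum. Below is a minimal sketch, assuming the operand text is copied from the disassembly as "#3504" or "#0xdc0"; parse_imm is a hypothetical helper, not part of PokeIoInfo.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse an immediate operand as printed by a disassembler, e.g. "#3504"
 * or "#0xdc0". Base 0 makes strtoul read a 0x prefix as hex and plain
 * digits as decimal (note that a leading 0 alone would mean octal). */
static uint32_t parse_imm(const char *text)
{
    assert(text != NULL);
    if (*text == '#')
        text++;                 /* skip the assembler's immediate marker */
    return (uint32_t)strtoul(text, NULL, 0);
}
```

Here parse_imm("#3504") returns 0xdb0, so 0x1801d8000 + 0xdb0 - 0x180000000 = 0x1d8db0, which matches the value that made PokeIoInfo.exe work.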

@dennisameling
Contributor Author

Could you give me any pointers (pun intended) as to how I can use this info to get a __pioinfo = (ioinfo**)0x.....;? Learning as I go 😄 appreciate your help here!

@Biswa96
Member

Biswa96 commented Oct 1, 2022

I am not familiar with ARM assembly. I just started reading the ARM manual one week ago. Others may provide some hint.

IMO, it is not possible to get the offset of __pioinfo at runtime with simple byte checking. It may be possible to download the pdb file of ucrtbase.dll and get the offset value with MSDIA (MS debug interface). But that would make ruby a little disassembler program.

DLLs are different in Windows 10 and 11. e.g. Win11 DLLs have pointer authentication instructions PACIBSP and AUTIBSP.

@dennisameling
Contributor Author

Let's reopen this PR then so others can chime in 👍🏼

@dennisameling dennisameling reopened this Oct 1, 2022
@jeremyd2019
Member

> IMO, it is not possible to get the offset of __pioinfo at runtime with simple byte checking.

I think it is (technically by checking instructions rather than bytes; on arm64 I think instructions are 4 bytes), with limitations/assumptions/brittleness similar to what the x86/x86_64 code has. Let me try describing what it's doing at a higher level; maybe that will help.

It scans starting at the address of _isatty looking for the end of the function (a ret instruction, but the x86 code also looks for the instruction before the ret. I don't think that instruction before would work on arm64, but with instructions being 4 bytes it's probably less likely to hit a false positive than a 1-byte instruction).

From there, it looks backwards for the last instruction that loads an address (not a very good description) and decodes the immediate that says where that address is (on x86_64 that means using the rip-relative offset) to get the pointer.

On ARM64, what I would try doing is scanning forward for a ret instruction, then scanning backwards from there for an adrp/add pair (adrl pseudo-instruction) on x8, and decode the immediates to calculate the pointer. I would have to read some ARM64 docs on how instructions are encoded to be able to do that accurately, I would expect going through reading uint32_ts and masking them to split the instruction from the immediates.
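
A minimal C sketch of that ARM64 idea follows, under stated assumptions: it is not the upstream ruby code, all function names are hypothetical, it only recognises a plain ret (not retaa/retab), and it accepts any register as long as the add reads and writes the adrp destination (checking for register 8 would restrict it to x8). It is just as brittle as the x86 variants described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RET Xn: fixed bits 0xD65F0000, register number in bits 9..5. */
static int is_ret(uint32_t insn)
{
    return (insn & 0xFFFFFC1Fu) == 0xD65F0000u;
}

/* ADRP: bit 31 set and bits 28..24 equal to 0b10000. */
static int is_adrp(uint32_t insn)
{
    return (insn & 0x9F000000u) == 0x90000000u;
}

/* 64-bit ADD (immediate) without flag setting. */
static int is_add_imm(uint32_t insn)
{
    return (insn & 0xFF800000u) == 0x91000000u;
}

/* Page address computed by an ADRP located at address pc. */
static uint64_t adrp_page(uint32_t insn, uint64_t pc)
{
    int64_t imm = (int64_t)((((insn >> 5) & 0x7FFFFu) << 2) |
                            ((insn >> 29) & 0x3u));
    if (imm & (1 << 20))               /* sign-extend the 21-bit value */
        imm -= (1 << 21);
    return (pc & ~(uint64_t)0xFFF) + (uint64_t)(imm * 4096);
}

/* Immediate of ADD (immediate), honouring the optional LSL #12 (bit 22). */
static uint64_t add_imm(uint32_t insn)
{
    uint64_t imm12 = (insn >> 10) & 0xFFFu;
    return (insn & (1u << 22)) ? imm12 << 12 : imm12;
}

/* code: instruction words starting at the function entry,
 * pc:   address of code[0],
 * max:  number of words that may be examined.
 * Returns the address formed by the last adrp/add pair before the first
 * ret, or 0 if no ret or no such pair is found. */
static uint64_t find_last_adrp_add(const uint32_t *code, uint64_t pc, size_t max)
{
    size_t end = 0;

    assert(code != NULL);
    while (end < max && !is_ret(code[end]))   /* scan forward for ret */
        end++;
    if (end == max)
        return 0;

    /* Scan backwards; code[i] is the candidate add, code[i - 1] the adrp. */
    for (size_t i = end; i-- > 1; ) {
        uint32_t hi = code[i - 1], lo = code[i];
        if (is_adrp(hi) && is_add_imm(lo)) {
            uint32_t rd = hi & 0x1Fu;
            if ((lo & 0x1Fu) == rd && ((lo >> 5) & 0x1Fu) == rd)
                return adrp_page(hi, pc + 4 * (uint64_t)(i - 1)) + add_imm(lo);
        }
    }
    return 0;
}
```

Fed with the instruction words from the _isatty objdump listing earlier in this thread (entry at 0x1800132e0), the backward scan stops at the pair at 0x18001330c/0x180013310 and returns 0x1801d8db0. In a real process the words would be read from the address of _isatty itself, so the result would already be a runtime pointer.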

@Biswa96
Member

Biswa96 commented Oct 1, 2022

Agreed. I also tried to follow the x86 way, but some unknowns caught me. Can we be sure that: 1. the operand will always be x8? 2. adrp and add will always be together? 3. there will be no add or adrp after the __pioinfo operation?

Speaking of docs, I have read this: https://developer.arm.com/documentation/ddi0487/latest. The ARM encoding is fairly interesting to me. For adrp, see section C6.2.11.

@jeremyd2019
Member

> Agreed. I also tried to follow the x86 way, but some unknowns caught me. Can we be sure that: 1. the operand will always be x8? 2. adrp and add will always be together? 3. there will be no add or adrp after the __pioinfo operation?

1 and 3, no, but those same assumptions (that it's the last one, and that it uses the same register) are made in the x86 variants. That's what I was getting at about the brittleness. 2, probably, since they form one logical operation/pseudo-instruction.

@Biswa96
Member

Biswa96 commented Oct 1, 2022

> I would expect going through reading uint32_ts and masking them to split the instruction from the immediates.

Something like this?

  1. The add operation in isatty
add x8, x8, #3520 =  91 37 01 08
  2. Extract the immediate.
(0x91370108 >> 0xA) & 0xFFF = 3520
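
The same masking, spelled out as a small C sketch. The struct and function names are invented for illustration; the field positions are those of the A64 64-bit ADD (immediate) encoding (Rd in bits 4..0, Rn in bits 9..5, imm12 in bits 21..10, shift flag in bit 22).

```c
#include <assert.h>
#include <stdint.h>

/* True for the 64-bit ADD (immediate) encoding: bits 31..23 == 0b100100010. */
static int is_add_imm64(uint32_t insn)
{
    return (insn & 0xFF800000u) == 0x91000000u;
}

typedef struct {
    unsigned rd;    /* destination register number */
    unsigned rn;    /* source register number      */
    uint32_t imm;   /* immediate after the optional LSL #12 */
} add_fields;

/* Split an ADD (immediate) word into its register and immediate fields. */
static add_fields decode_add_imm64(uint32_t insn)
{
    add_fields f;

    assert(is_add_imm64(insn));
    f.rd = insn & 0x1Fu;
    f.rn = (insn >> 5) & 0x1Fu;
    f.imm = (insn >> 10) & 0xFFFu;
    if (insn & (1u << 22))          /* sh bit: immediate is shifted left by 12 */
        f.imm <<= 12;
    return f;
}
```

For 0x91370108 this gives rd = 8, rn = 8 and imm = 3520 (0xdc0), i.e. add x8, x8, #3520.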

@lazka
Member

lazka commented Mar 10, 2023

Since there hasn't been any progress here or upstream (https://bugs.ruby-lang.org/issues/18605), I've created #16135, which just includes the patch.

@jeremyd2019
Member

#19179

4 participants