Add support for AMD uProf in fox #125

Manually merged
rarias merged 10 commits from amd-uprof into master 2025-09-19 10:56:50 +02:00
Owner

Adds support for [AMD μProf](https://www.amd.com/es/developer/uprof.html), which lets users extract low-level performance metrics from the AMD CPUs in fox. It comes with a custom driver, which needs some patching to build against the latest kernel.

Added a systemd service instead of a udev rule, because the module does not emit uevents, so it doesn't seem to trigger udevd.

@varcila let us know if you find any issues; so far it seems to open fine.

![image.png](/attachments/ecb5feea-02d8-4491-83f0-e0a42ac2207c)
rarias added 2 commits 2025-07-02 15:06:45 +02:00
rarias force-pushed amd-uprof from e0ecd2be0c to 9c32f54bf4 2025-09-04 11:13:17 +02:00 Compare
rarias changed title from WIP: Add support for AMD uProf in fox to Add support for AMD uProf in fox 2025-09-04 11:13:37 +02:00
rarias added 2 commits 2025-09-04 12:26:01 +02:00
Fixes the build on Linux 6.15.6, which was failing because the include
files could not be found.

hrtimer_init() is replaced by hrtimer_setup(), which takes the callback
function as an argument.

See: https://lwn.net/Articles/996598/
rarias added 4 commits 2025-09-05 12:53:06 +02:00
rarias force-pushed amd-uprof from 51a77777f1 to 93cc24a40b 2025-09-05 12:55:13 +02:00 Compare
rarias added 1 commit 2025-09-05 13:55:26 +02:00
Collaborator

Sorry for the delay! The following command fails as follows:

$ AMDuProfSys collect --config core,l3,df -C 0-10 -o u3 stress -c 10 -m 5  --vm-keep -t 1
[959248] Error loading Python lib '/tmp/nix-shell.GRUNhy/_MEIW2arhq/libpython3.7m.so.1.0': dlopen: libcrypt.so.1: cannot open shared object file: No such file or directory

The following command should be enough to test that we can capture L3 uncore events. If this works, I am pretty confident the rest of the perf events can be captured:

$ AMDuProfPcm -m l3 -O uprof2  stress -c 10 -m 5  --vm-keep -t 1
Warning: Unable to find libnuma.so.
Info: Collecting system wide data since launch app monitoring is not supported when collecting L3/DF Metrics.
Info: Collect "ipc" along with "l3" for L3 pti metrics.
Profiling started
stress: info: [959586] dispatching hogs: 10 cpu, 0 io, 5 vm, 0 hdd
stress: info: [959586] successful run completed in 1s
Generated Timeseriesdata file path: uprof2/AMDuProfPcm-Timeseries_Sep-16-2025_12-38-29/report-timeseries.csv

Note the warning on not finding libnuma. Other similar warnings happen with other commands, but are not critical.

Collaborator

I would add the uProf user guide to the docs (Gitea does not allow me to upload it here; the file is too large).
[Online uProf user guide, updated June 2025](https://docs.amd.com/viewer/book-attachment/~u3efRNHg3qXrkeeZMlmKQ/N~ecq~c9_mnF2Xb7GS7RXg-~u3efRNHg3qXrkeeZMlmKQ)
rarias added 2 commits 2025-09-16 16:12:33 +02:00
It tries to dlopen libcrypt.so.1 and libstdc++.so.6, so we make sure
they are available by adding them to the runpath.
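
As a rough illustration of the failure mode, the sketch below asks the dynamic loader whether it can resolve a given soname. It is not part of the PR: `can_dlopen` is a hypothetical helper, and `ctypes.CDLL` searches from the Python interpreter's own environment rather than from the runpath of the AMDuProf binaries, so it only tells you whether the library is visible to the current process.

```python
import ctypes


def can_dlopen(soname: str) -> bool:
    """Return True if the dynamic loader can open `soname` from this process."""
    try:
        ctypes.CDLL(soname)  # wraps dlopen() on Linux
        return True
    except OSError:
        return False


# The two libraries named in the commit message above.
for name in ("libcrypt.so.1", "libstdc++.so.6"):
    print(f"{name}: {'found' if can_dlopen(name) else 'missing'}")
```

A bare soname (no slash) is what lets the loader search its usual locations, which is why adding the libraries to the runpath is enough for the real binaries.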
Author
Owner

Added a couple of fixes; it should work now:

fox% nix shell nixpkgs#stress
fox% AMDuProfSys collect --config core,l3,df -C 0-10 -o u3 stress -c 10 -m 5  --vm-keep -t 1

Collecting profile data for stress -c 10 -m 5 --vm-keep -t 1 using "pmc-driver"...

Running AMDuProfCLI stat
/nix/store/mbkvnvj9mhffd9l7fc9369qk6wvd18xm-AMD-uProf-5.1.701/bin/AMDuProfCLI
stress: info: [970945] dispatching hogs: 10 cpu, 0 io, 5 vm, 0 hdd
stress: info: [970945] successful run completed in 2s
Profiling (data collection) completed
Generated data files path: /home/Computational/rarias/jungle/u3/AMDuProf-stress-Stat

To generate report use:
 "/nix/store/mbkvnvj9mhffd9l7fc9369qk6wvd18xm-AMD-uProf-5.1.701/bin/AMDPerf/AMDuProfSys" report -i "/home/Computational/rarias/jungle/u3/u3.ses"
 

fox% AMDuProfSys report -i u3/u3.ses

Generating report file...

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 332.87it/s]

Generated report: /home/Computational/rarias/jungle/u3/u3_core.csv
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 552.10it/s]

Generated report: /home/Computational/rarias/jungle/u3/u3_df.csv
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 4635.66it/s]

Generated report: /home/Computational/rarias/jungle/u3/u3_l3.csv

Reporting time:0H:0M:0S:71m


fox% AMDuProfPcm -m l3 -O uprof2  stress -c 10 -m 5  --vm-keep -t 1
Warning: Unable to find libnuma.so.
Info: Collecting system wide data since launch app monitoring is not supported when collecting L3/DF Metrics.
Info: Collect "ipc" along with "l3" for L3 pti metrics.
Profiling started
stress: info: [971184] dispatching hogs: 10 cpu, 0 io, 5 vm, 0 hdd
stress: info: [971184] successful run completed in 1s
Generated Timeseriesdata file path: uprof2/AMDuProfPcm-Timeseries_Sep-16-2025_16-11-41/report-timeseries.csv


fox% tail uprof2/AMDuProfPcm-Timeseries_Sep-16-2025_16-11-41/report-timeseries.csv
CPI    : CPU Cycles Per Instructions
pti    : Per Thousand Instructions
ptc    : Per Thousand CPU Cycles
Topdown metrics are reported in "% slots"

Profile Time: 2025/09/16 16:11:41:298
L3 METRICS,,,,,
System (Aggregated),,,,,
L3 Access,L3 Miss,L3 Miss %,L3 Hit %,Ave L3 Miss Latency (ns),
1137522015.00,359028747.00,31.56,68.44,99.23,
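
The percentages in that last row can be cross-checked against the raw counters. A minimal sketch, assuming the L3 block of `report-timeseries.csv` always has the header row immediately followed by one data row, as in the tail shown above (`l3_summary` is a hypothetical helper, not part of uProf):

```python
import csv
import io

# Last two lines of the report shown above (header row + aggregated values).
SAMPLE = """\
L3 Access,L3 Miss,L3 Miss %,L3 Hit %,Ave L3 Miss Latency (ns),
1137522015.00,359028747.00,31.56,68.44,99.23,
"""


def l3_summary(text: str) -> dict:
    """Map the L3 header names to float values, skipping the trailing empty column."""
    header, values = list(csv.reader(io.StringIO(text)))[:2]
    return {name: float(val) for name, val in zip(header, values) if name}


row = l3_summary(SAMPLE)
# Miss rate recomputed from the raw access and miss counters.
miss_pct = 100.0 * row["L3 Miss"] / row["L3 Access"]
print(f"recomputed miss rate: {miss_pct:.2f}% (reported {row['L3 Miss %']}%)")
```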
Author
Owner

> I would add the uprof user guide to the docs (gitea does not allow me to upload here, file too large).
> Online uprof user guide updated to June 2025

I would prefer to add a link to the document so we don't make the repository too large. Maybe we can add this one to the fox page? https://docs.amd.com/r/en-US/57368-uProf-user-guide

rarias added 1 commit 2025-09-16 16:20:44 +02:00
Collaborator

> I would prefer to add a link to the document so we don't make the repository too large. Maybe we can add this one to the fox page? https://docs.amd.com/r/en-US/57368-uProf-user-guide

The file is heavy, but since AMD/Intel links sometimes change or break, I’d prefer including the PDF. It’s about 20 MiB, so not huge. That said, it’s totally fine if you’d rather just link it.
Collaborator

> > I would prefer to add a link to the document so we don't make the repository too large. Maybe we can add this one to the fox page? https://docs.amd.com/r/en-US/57368-uProf-user-guide
>
> The file is heavy, but since AMD/Intel links sometimes change or break, I’d prefer including the PDF. It’s about 20 MiB, so not huge. That said, totally fine if you’d rather just link it

We already have some PDF files in jungle under doc:

$ nix flake metadata --json | jq .path | xargs du -ha | sort -h | tail -n10
924K    /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source/web/themes
980K    /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source/m
1.1M    /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source/web/static
1.4M    /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source/web/content
3.4M    /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source/doc/SEL_TroubleshootingGuide.pdf
3.4M    /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source/web
7.7M    /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source/doc/Intel_Server_Board_S2600WF_TPS_2_6.pdf
14M     /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source/doc/R1000WF_SystemIntegration_and_ServiceGuide_Rev2_4.pdf
25M     /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source/doc
30M     /nix/store/z6q4gy9fq0413nfvb9j3ri0rqgdzc8md-source

Notice that of the 30 MiB total in the flake source, 25 MiB are PDFs. That's not ideal, since the source is copied to the store every time we evaluate a different revision (even though we only need the docs on hut/tent, which host them).

This is already a problem with the nixpkgs repo, where the flake source takes 400 MB, which can add up quickly.

I am not opposed to adding the PDF, but we should be wary of this and consider alternatives for hosting large files outside our source tree.
Author
Owner

If you are worried that it might disappear, we can keep a copy elsewhere. However, the user guide will likely change as they release new versions.

> Notice that from the total 30Mib in the flake source 25Mib are pdfs.

I wanted to test Git LFS for this use case, but I have never used it before. Gitea has support for it.

I’m aware that the repo is already large. If Git LFS works okay, we may be able to rewrite the history to get rid of the big blobs and keep the other docs as well.

Let’s first address the blockers of this PR so we can merge it, and then evaluate other solutions in a separate issue.
rarias force-pushed amd-uprof from ba98023645 to c8bc19d891 2025-09-17 13:22:07 +02:00 Compare
Author
Owner

Uploaded a copy of the PDF to the jungle web server for now:

[ba98023645..c8bc19d891](https://jungle.bsc.es/git/rarias/jungle/compare/ba980236452884a7762cddf58ba05084dfa7fb08..c8bc19d891d204fc7e6ee605cc6fbac013c299ac)

It is here: https://jungle.bsc.es/pub/57368-uprof-user-guide.pdf
rarias requested review from abonerib 2025-09-17 13:24:51 +02:00
rarias requested review from varcila 2025-09-17 13:24:51 +02:00
Collaborator

Kernel module amd_hsmp seems to be missing. I tried a command I used for getting the power metrics and a bunch of other events, and it complains about not having said module. I can confirm that the version of uProf used for the thesis did report power metrics with this specific command.

The command, also showing the output:

❯ AMDuProfPcm profile -I 100 --collect-power -O out_stress_amduprof stress -c 10 -m 5  --vm-keep -t 1
Warning: Unable to find libnuma.so.
Warning: print interval is less than default interval and setting it to default interval 1000
Info: Collecting system wide data since launch app monitoring is not supported when collecting L3/DF Metrics.
**Warning: Power metrics are not supported in this platform due to non-availability of amd_hsmp driver.
Try loading amd_hsmp module with "sudo modprobe amd_hsmp".**
Profiling started
stress: info: [973208] dispatching hogs: 10 cpu, 0 io, 5 vm, 0 hdd
stress: info: [973208] successful run completed in 1s
Generated Timeseriesdata file path: out_stress_amduprof/AMDuProfPcm-Multi_Sep-17-2025_18-42-00/report-timeseries.csv
Generated Cumulativedata file path: out_stress_amduprof/AMDuProfPcm-Multi_Sep-17-2025_18-42-00/report-cumulative.csv
Generated HTML report at out_stress_amduprof/AMDuProfPcm-Multi_Sep-17-2025_18-42-00/report.html
Open the files in browser to view results.
rarias added 1 commit 2025-09-18 12:39:36 +02:00
Author
Owner

> Kernel module amd_hsmp seems to be missing.

Fixed now:

fox% AMDuProfPcm profile -I 100 --collect-power -O s sleep 1
Warning: Unable to find libnuma.so.
Warning: print interval is less than default interval and setting it to default interval 1000
Info: Collecting system wide data since launch app monitoring is not supported when collecting L3/DF Metrics.
Profiling started
Generated Timeseriesdata file path: s/AMDuProfPcm-Multi_Sep-18-2025_12-38-54/report-timeseries.csv
Generated Cumulativedata file path: s/AMDuProfPcm-Multi_Sep-18-2025_12-38-54/report-cumulative.csv
Generated HTML report at s/AMDuProfPcm-Multi_Sep-18-2025_12-38-54/report.html
Open the files in browser to view results.

Regarding the "Unable to find libnuma.so" warning, they seem to dlopen libnuma from a hardcoded path:

│    │╎╎╎   0x00517a1e      4885c0         test rax, rax
│   ┌─────< 0x00517a21      0f85a9000000   jne 0x517ad0
│   ││╎╎╎   0x00517a27      be01000000     mov esi, 1
│   ││╎╎╎   0x00517a2c      bf11295600     mov edi, str._usr_lib64_libnuma.so    ; 0x562911 ; "/usr/lib64/libnuma.so"
│   ││╎╎╎   0x00517a31      e8ea25efff     call sym.imp.dlopen         ;[3]
│   ││╎╎╎   0x00517a36      488903         mov qword [rbx], rax
│   ││╎╎╎   0x00517a39      4885c0         test rax, rax
│   ││╎└──< 0x00517a3c      0f8526ffffff   jne 0x517968
│   ││╎ ╎   0x00517a42      ba23000000     mov edx, 0x23               ; '#' ; 35
│   ││╎ ╎   0x00517a47      be68295600     mov esi, str.Warning:_Unable_to_find_libnuma.so.    ; 0x562968 ; "Warning: Unable to find libnuma.so."
│   ││╎ ╎   0x00517a4c      bfa0ec5900     mov edi, obj.std::cerr      ; 0x59eca0

So even if we add it to the runpath it doesn't work: dlopen() only searches the runpath for names without a slash, and this is an absolute path. Let me see if I can patch the string so that it dlopens only "libnuma.so".
rarias added 1 commit 2025-09-18 13:35:41 +02:00
We change the search procedure so it detects NixOS from /etc/os-release
and uses "libnuma.so" when calling dlopen, instead of hardcoding a full
path under /usr. The full path of libnuma is stored in the runpath, so
dlopen can find it.
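
The patched lookup amounts to something like the following sketch. It only illustrates the logic described in the commit message and is not the code in the AMD binary: the function names are hypothetical, checking the `ID` field is an assumption (which os-release field the binary actually inspects is not stated here), and only the two library strings come from this thread.

```python
def parse_os_release(text: str) -> dict:
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def libnuma_dlopen_name(os_release_text: str) -> str:
    """Pick the string handed to dlopen() for libnuma."""
    # Assumption: NixOS is recognised by ID=nixos.
    if parse_os_release(os_release_text).get("ID") == "nixos":
        # Bare soname: dlopen() searches the runpath, where the Nix store path lives.
        return "libnuma.so"
    # Path with a slash: dlopen() opens exactly this file and skips the search.
    return "/usr/lib64/libnuma.so"


# Usage with an inline sample instead of reading /etc/os-release.
print(libnuma_dlopen_name('NAME=NixOS\nID=nixos\n'))
```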
Author
Owner

Patched:

fox% AMDuProfPcm profile -I 100 --collect-power -O s sleep 1
Warning: print interval is less than default interval and setting it to default interval 1000
Info: Collecting system wide data since launch app monitoring is not supported when collecting L3/DF Metrics.
Profiling started
Generated Timeseriesdata file path: s/AMDuProfPcm-Multi_Sep-18-2025_13-34-31/report-timeseries.csv
Generated Cumulativedata file path: s/AMDuProfPcm-Multi_Sep-18-2025_13-34-31/report-cumulative.csv
Generated HTML report at s/AMDuProfPcm-Multi_Sep-18-2025_13-34-31/report.html
Open the files in browser to view results.
varcila approved these changes 2025-09-18 14:24:35 +02:00
varcila left a comment
Collaborator

LGTM

abonerib reviewed 2025-09-19 09:58:38 +02:00
@ -0,0 +25,4 @@
version = "5.1.701";
tarball = "AMDuProf_Linux_x64_${version}.tar.bz2";
uprofSrc = runCommandLocal tarball {
Collaborator

I would add a comment to remember to update the radare patch addresses when changing the source.

Author
Owner

I'll add an md5sum check as well.
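
Such a guard could look like the sketch below: refuse to apply the fixed-address patch unless the binary's MD5 matches the one the addresses were derived from. This is an illustration only; `md5_of` and `check_before_patching` are hypothetical names, and the actual derivation may well do this with `md5sum` in shell instead.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_before_patching(path: str, expected_md5: str) -> None:
    """Raise if the file is not the exact binary the patch addresses were taken from."""
    actual = md5_of(path)
    if actual != expected_md5:
        raise RuntimeError(
            f"{path}: md5 {actual} != expected {expected_md5}; "
            "update the radare patch addresses before building"
        )
```

Failing loudly here makes a version bump break the build instead of silently writing strings at stale offsets.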

rarias marked this conversation as resolved
@ -0,0 +2,4 @@
, lib
, amd-uprof
, curl
, cacert
Collaborator

`curl` and `cacert` were left over.
rarias marked this conversation as resolved
@ -0,0 +2,4 @@
# so it matches NixOS:
#
# Change OS name to NixOS
wz NixOS @ 0x00550a43
Collaborator

I was fiddling with radare, and we could try to make this more robust by doing something like `/ original string ; wz NixOS @ hit0_0`, and likewise for `libnuma`.

But I don't know how to get the address of the `mov ecx` instruction, so it's a bit pointless. I also could not figure out how to make radare stop executing the script when an error occurs.
Author
Owner

It is not safe to adapt the patch automatically; it would need to be updated manually. Ideally we should ask upstream to add a NixOS option (or use it as a fallback if none matches).
Collaborator

Yeah, I wanted the derivation to fail if radare found something amiss, but the md5 checksum is cleaner.

abonerib marked this conversation as resolved
rarias force-pushed amd-uprof from 4e3c41c9fa to 967709982a 2025-09-19 10:30:27 +02:00 Compare
abonerib approved these changes 2025-09-19 10:44:55 +02:00
rarias force-pushed amd-uprof from 967709982a to 017e0d82f7 2025-09-19 10:56:34 +02:00 Compare
rarias manually merged commit 017e0d82f7 into master 2025-09-19 10:56:50 +02:00
Reference: rarias/jungle#125