Setup lake2 node for ceph storage #28

Closed
opened 2023-08-24 11:59:49 +02:00 by rarias · 22 comments
rarias commented 2023-08-24 11:59:49 +02:00 (Migrated from pm.bsc.es)

The lake2 machine will hold 7.2 TiB of disk space used for the CEPH filesystem. To install NixOS:

  • Enable serial in GRUB
  • Reboot and get a serial console
  • Boot NixOS with kexec
  • Perform the installation
  • Add NixOS entry to GRUB
  • Configure the disks for ceph
  • Set default boot entry to NixOS
  • Do some benchmarks
rarias commented 2023-08-24 12:32:28 +02:00 (Migrated from pm.bsc.es)

marked the checklist item Enable serial in GRUB as completed

rarias commented 2023-08-24 12:32:28 +02:00 (Migrated from pm.bsc.es)

marked the checklist item Reboot and get a serial console as completed

rarias commented 2023-08-24 13:33:52 +02:00 (Migrated from pm.bsc.es)

marked the checklist item Boot NixOS with kexec as completed

rarias commented 2023-08-24 14:10:52 +02:00 (Migrated from pm.bsc.es)

Hmm, not sure if the nvme disks are okay... They are taking forever to write files.

rarias commented 2023-08-24 15:30:52 +02:00 (Migrated from pm.bsc.es)

marked the checklist item Perform the installation as completed

rarias commented 2023-08-24 16:37:11 +02:00 (Migrated from pm.bsc.es)

The BMC keeps dropping the serial-over-LAN connection, which makes this a pain. Apparently, neither the BIOS nor GRUB can boot from the NVMe disks.

One option is to free some space on the sda disk and install NixOS in its own partition there. However, I would need to kexec into NixOS again so that the sda filesystem is not in use, then shrink the partition, create another one, install NixOS there and attempt to boot.

This risks leaving the node unbootable if the shrink fails. Before attempting it, I should set up a secondary boot mechanism (maybe PXE) so I can load a rescue system to fix it. The BMC dropping the console every few minutes won't help either.

rarias commented 2023-08-25 14:16:08 +02:00 (Migrated from pm.bsc.es)

Configuring PXE with pixiecore doesn't seem to be straightforward, as I also need to set up a DNS server, and the serial console is too painful to use for debugging the PXE agent.

So, I'm overwriting the partition table from the kexec'd NixOS running in RAM. Hopefully I manage to make the system boot again. Otherwise I will need to set up PXE or go there and boot NixOS from a USB pendrive.

rarias commented 2023-08-25 14:22:23 +02:00 (Migrated from pm.bsc.es)

It says all went okay...

lake2# nixos-install --flake .#lake2 --root /mnt
copying channel...
building the flake in git+file:///home/Computational/rarias/jungle?ref=refs/heads/lake2&rev=a2075cfd655a106b7fc5c15f18acc7bdef1bc0c1...
installing the boot loader...
setting up /etc...
updating GRUB 2 menu...
installing the GRUB 2 boot loader on /dev/disk/by-id/wwn-0x55cd2e414d53563a...
Installing for i386-pc platform.
Installation finished. No error reported.
setting up /etc...
setting up /etc...
setting root password...
New password:
Retype new password:
passwd: password updated successfully
installation finished!

The disk is marked as bootable:

lake2# fdisk -l /dev/sda
Disk /dev/sda: 223,57 GiB, 240057409536 bytes, 468862128 sectors
Disk model: INTEL SSDSC2BB24
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0xf9e2e1be

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1  *         2048 453236735 453234688 216,1G 83 Linux
/dev/sda2       453236736 468860927  15624192   7,5G 82 Linux swap / Solaris
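
The sizes fdisk prints can be cross-checked by hand. A minimal sketch, with the numbers copied from the output above; the variable names are only illustrative, and the result uses a dot where the fdisk locale prints a comma:

```shell
# Numbers copied from the fdisk output above; variable names are illustrative.
sectors=468862128      # total sectors on /dev/sda
sector_size=512        # logical sector size in bytes

# Total size in bytes: sectors times the logical sector size.
bytes=$((sectors * sector_size))
echo "$bytes bytes"

# fdisk prints GiB (2^30 bytes); awk handles the fractional division.
gib=$(LC_ALL=C awk -v b="$bytes" 'BEGIN { printf "%.2f", b / (1024 * 1024 * 1024) }')
echo "$gib GiB"

# A partition's sector count is End - Start + 1 (both ends are inclusive).
sda1_sectors=$((453236735 - 2048 + 1))
echo "sda1: $sda1_sectors sectors"
```

The three values should agree with the byte count, the 223,57 GiB figure and the Sectors column of /dev/sda1 reported by fdisk.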

And the GRUB entry seems coherent:

...
terminal_input --append serial
terminal_output --append serial


menuentry "NixOS - Default" --class nixos --unrestricted {
search --set=drive1 --fs-uuid 0fd0a970-1083-451a-8ba5-bd529cb23689
search --set=drive2 --fs-uuid 0fd0a970-1083-451a-8ba5-bd529cb23689
  linux ($drive2)/nix/store/6zfnb5w30jy06qx6w0hr9zyr7rwzalz6-linux-6.4.11/bzImage init=/nix/store/f2qjfiqd48mdjv3m04w5cd85l31q2p73-nixos-system-lake2-23.11.20230819.d680ded/init console=tty1 console=ttyS0,115200 loglevel=4
  initrd ($drive2)/nix/store/71z1624fpj0l6s786fg0agyxa084xliz-initrd-linux-6.4.11/initrd
}


submenu "NixOS - All configurations" --class submenu {
menuentry "NixOS - Configuration 1 (2023-08-25 - 23.11.20230819.d680ded)" --class nixos {
search --set=drive1 --fs-uuid 0fd0a970-1083-451a-8ba5-bd529cb23689
search --set=drive2 --fs-uuid 0fd0a970-1083-451a-8ba5-bd529cb23689
  linux ($drive2)/nix/store/6zfnb5w30jy06qx6w0hr9zyr7rwzalz6-linux-6.4.11/bzImage init=/nix/store/f2qjfiqd48mdjv3m04w5cd85l31q2p73-nixos-system-lake2-23.11.20230819.d680ded/init console=tty1 console=ttyS0,115200 loglevel=4
  initrd ($drive2)/nix/store/71z1624fpj0l6s786fg0agyxa084xliz-initrd-linux-6.4.11/initrd
}

}

And generated fstab looks good too:

# This is a generated file.  Do not edit!
#
# To make changes, edit the fileSystems and swapDevices NixOS options
# in your /etc/nixos/configuration.nix file.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>

# Filesystems.
/dev/disk/by-label/nixos / ext4 x-initrd.mount 0 1
10.0.40.30:/home /home nfs nfsvers=3,rsize=1024,wsize=1024,cto,nofail 0 0
none /sys/kernel/tracing tracefs defaults 0 0


# Swap devices.
/dev/disk/by-label/swap none swap defaults

Let's try our luck. I won't be able to see the output via serial, so if I don't get a shell via SSH, something went wrong. Rebooting...

rarias commented 2023-08-25 14:27:05 +02:00 (Migrated from pm.bsc.es)

GRUB loaded!


                           GNU GRUB  version 2.12~rc1

 +----------------------------------------------------------------------------+
 |*NixOS - Default                                                            |
 | NixOS - All configurations                                                 |
 |                                                                            |
 |                                                                            |
 |                                                                            |
 |                                                                            |
 |                                                                            |
 |                                                                            |
 |                                                                            |
 |                                                                            |
 |                                                                            |
 |                                                                            |
 +----------------------------------------------------------------------------+

      Use the ^ and v keys to select which entry is highlighted.
      Press enter to boot the selected OS, `e' to edit the commands
      before booting or `c' for a command-line.
   The highlighted entry will be executed automatically in 0s.

Network is up:

From hut (10.0.40.7) icmp_seq=3279 Destination Host Unreachable
From hut (10.0.40.7) icmp_seq=3280 Destination Host Unreachable
From hut (10.0.40.7) icmp_seq=3281 Destination Host Unreachable
From hut (10.0.40.7) icmp_seq=3282 Destination Host Unreachable
64 bytes from oss02 (10.0.40.42): icmp_seq=3283 ttl=64 time=1023 ms
64 bytes from oss02 (10.0.40.42): icmp_seq=3285 ttl=64 time=0.149 ms
64 bytes from oss02 (10.0.40.42): icmp_seq=3286 ttl=64 time=0.142 ms
64 bytes from oss02 (10.0.40.42): icmp_seq=3287 ttl=64 time=0.140 ms
64 bytes from oss02 (10.0.40.42): icmp_seq=3288 ttl=64 time=0.142 ms

Aaand, done:

lake2$ uptime
 14:26:27  up   0:01,  1 user,  load average: 0,79, 0,24, 0,09
lake2$ uname -a
Linux lake2 6.4.11 #1-NixOS SMP PREEMPT_DYNAMIC Wed Aug 16 16:32:31 UTC 2023 x86_64 GNU/Linux
rarias commented 2023-08-25 14:28:03 +02:00 (Migrated from pm.bsc.es)

marked the checklist item Add NixOS entry to GRUB as completed

rarias commented 2023-08-25 14:53:36 +02:00 (Migrated from pm.bsc.es)

Lake2 now has visibility of the ceph cluster:

lake2$ sudo ceph -s
  cluster:
    id:     9c8d06e0-485f-4aaf-b16b-06d6daf1232b
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum bay (age 2d)
    mgr: bay(active, since 2d)
    mds: 1/1 daemons up, 1 standby
    osd: 4 osds: 4 up (since 2d), 4 in (since 3w)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 25 objects, 1.5 MiB
    usage:   310 MiB used, 1.5 TiB / 1.5 TiB avail
    pgs:     97 active+clean
rarias commented 2023-08-25 15:43:48 +02:00 (Migrated from pm.bsc.es)

marked the checklist item Configure the disks for ceph as completed

rarias commented 2023-08-25 16:44:08 +02:00 (Migrated from pm.bsc.es)

Writing several GB causes the write operation to hang:

$ strace -f dd if=/dev/urandom of=/ceph/rarias/kk bs=1M count=$((16*1024)) status=progress
...
read(0, "s\26\371B&6 \316\372\350\202\365Nc\f\347?+\365\23?H0\342\241Y\225\240\23R\220\217"..., 1048576) = 1048576
write(1, "s\26\371B&6 \316\372\350\202\365Nc\f\347?+\365\23?H0\342\241Y\225\240\23R\220\217"..., 1048576
<stuck here forever>

hut$ sudo cat /proc/1402180/stack
[<0>] wait_woken+0x54/0x70
[<0>] ceph_get_caps+0x4b3/0x6f0 [ceph]
[<0>] ceph_write_iter+0x316/0xdc0 [ceph]
[<0>] vfs_write+0x22e/0x3f0
[<0>] ksys_write+0x6f/0xf0
[<0>] do_syscall_64+0x3e/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0xe1

lake2$ sudo ceph fs get cephfs | grep max_mds
max_mds 1

Relevant issue: https://tracker.ceph.com/issues/54044

Maybe upgrading the mds to ceph 18 fixes the problem.
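
To read the stack above more quickly, the frames that belong to the ceph kernel module can be filtered out. This is only a sketch: `ceph_frames` is a hypothetical helper name, and it assumes the `/proc/<pid>/stack` format shown above (function name with `+offset/size`, then the module in brackets).

```shell
ceph_frames() {
    # Reads /proc/<pid>/stack text on stdin and prints the function names
    # of the frames tagged [ceph], without the +offset/size suffix.
    awk '/\[ceph\]/ { sub(/[+].*/, "", $2); print $2 }'
}

# Stack copied from the comment above.
ceph_frames <<'EOF'
[<0>] wait_woken+0x54/0x70
[<0>] ceph_get_caps+0x4b3/0x6f0 [ceph]
[<0>] ceph_write_iter+0x316/0xdc0 [ceph]
[<0>] vfs_write+0x22e/0x3f0
[<0>] ksys_write+0x6f/0xf0
EOF
```

For this trace it prints `ceph_get_caps` and `ceph_write_iter`, which suggests the write is sleeping inside the CephFS client while it waits to obtain capabilities, rather than in the block layer. On a live system the input would come from something like `sudo cat /proc/<pid>/stack | ceph_frames`.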

rarias commented 2023-08-25 18:26:17 +02:00 (Migrated from pm.bsc.es)

There is a PR to update ceph to 18.2.0, but it hasn't landed in nixpkgs yet: https://github.com/NixOS/nixpkgs/pull/247849

I took the derivation and placed it in an overlay; let's see if I can upgrade ceph.

rarias commented 2023-08-25 19:14:50 +02:00 (Migrated from pm.bsc.es)

Upgraded to 18.2.0, but the write still hangs after about 21 GiB:

hut$ dd if=/dev/zero of=/ceph/rarias/kk bs=1M count=$((32*1024)) status=progress
22137536512 bytes (22 GB, 21 GiB) copied, 24 s, 922 MB/s
...
[<0>] wait_woken+0x54/0x70
[<0>] ceph_get_caps+0x4b3/0x6f0 [ceph]
[<0>] ceph_write_iter+0x316/0xdc0 [ceph]
[<0>] vfs_write+0x22e/0x3f0
[<0>] ksys_write+0x6f/0xf0
[<0>] do_syscall_64+0x3e/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x77/0xe1
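
The figures in the dd progress line above can be reproduced from the byte count and the elapsed time. A small sketch with the numbers copied from that line; the variable names are illustrative, and since dd rounds the elapsed time the rate is approximate:

```shell
# Numbers copied from the dd progress line above.
bytes=22137536512
seconds=24             # dd rounds the elapsed time, so the rate is approximate

# Throughput in decimal megabytes per second, as dd reports it.
rate=$(LC_ALL=C awk -v b="$bytes" -v s="$seconds" 'BEGIN { printf "%.0f", b / s / 1000000 }')

# Amount written before the hang, in GiB (2^30 bytes).
gib=$(LC_ALL=C awk -v b="$bytes" 'BEGIN { printf "%.1f", b / (1024 * 1024 * 1024) }')

echo "$rate MB/s, $gib GiB written before the hang"
```

This gives 922 MB/s and 20.6 GiB, which dd displays rounded as 21 GiB.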
rarias commented 2023-08-25 20:24:32 +02:00 (Migrated from pm.bsc.es)

marked the checklist item Set default boot entry to NixOS as completed

rarias commented 2023-08-25 20:24:39 +02:00 (Migrated from pm.bsc.es)

changed the description

rarias commented 2023-08-28 08:56:35 +02:00 (Migrated from pm.bsc.es)

mentioned in issue #29

rarias commented 2023-08-28 18:03:30 +02:00 (Migrated from pm.bsc.es)

mentioned in merge request !20

rarias commented 2023-08-30 17:18:01 +02:00 (Migrated from pm.bsc.es)

Adding the lake1 (oss01) NVMe disks to lake2 and bay won't work, as only the last 4 bays support NVMe:

![nvme](/uploads/1a3f664be37051529e2beee889b2f3f4/nvme.png)
rarias commented 2023-09-04 12:17:38 +02:00 (Migrated from pm.bsc.es)

Without root_squash it runs fine:

hut% fio --filename=/ceph/rarias/fio1 --size=10GB --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=120 --numjobs=4 --time_based --name thr --group_reporting
thr: (g=0): rw=randrw, bs=(R) 64.0KiB-64.0KiB, (W) 64.0KiB-64.0KiB, (T) 64.0KiB-64.0KiB, ioengine=libaio, iodepth=64
...
fio-3.35
Starting 4 processes
Jobs: 4 (f=4): [m(4)][100.0%][r=87.8MiB/s,w=90.7MiB/s][r=1404,w=1451 IOPS][eta 00m:00s]
thr: (groupid=0, jobs=4): err= 0: pid=4517: Mon Sep  4 12:11:43 2023
  read: IOPS=1455, BW=91.0MiB/s (95.4MB/s)(10.7GiB/120069msec)
    slat (usec): min=11, max=776, avg=38.94, stdev=18.64
    clat (usec): min=1106, max=449961, avg=85410.20, stdev=128413.63
     lat (usec): min=1145, max=449986, avg=85449.14, stdev=128412.94
    clat percentiles (usec):
     |  1.00th=[  1926],  5.00th=[  2769], 10.00th=[  3752], 20.00th=[  6849],
     | 30.00th=[ 10028], 40.00th=[ 13304], 50.00th=[ 17695], 60.00th=[ 25035],
     | 70.00th=[ 40633], 80.00th=[261096], 90.00th=[333448], 95.00th=[354419],
     | 99.00th=[379585], 99.50th=[387974], 99.90th=[404751], 99.95th=[413139],
     | 99.99th=[434111]
   bw (  KiB/s): min=72704, max=115584, per=99.99%, avg=93136.53, stdev=2016.83, samples=960
   iops        : min= 1136, max= 1806, avg=1455.25, stdev=31.51, samples=960
  write: IOPS=1454, BW=90.9MiB/s (95.3MB/s)(10.7GiB/120069msec); 0 zone resets
    slat (usec): min=16, max=3354, avg=54.62, stdev=33.14
    clat (msec): min=2, max=451, avg=90.40, stdev=127.50
     lat (msec): min=2, max=451, avg=90.46, stdev=127.50
    clat percentiles (msec):
     |  1.00th=[    5],  5.00th=[    7], 10.00th=[    9], 20.00th=[   12],
     | 30.00th=[   15], 40.00th=[   19], 50.00th=[   24], 60.00th=[   32],
     | 70.00th=[   48], 80.00th=[  268], 90.00th=[  338], 95.00th=[  355],
     | 99.00th=[  384], 99.50th=[  393], 99.90th=[  409], 99.95th=[  414],
     | 99.99th=[  435]
   bw (  KiB/s): min=69120, max=118912, per=99.98%, avg=93099.73, stdev=2137.25, samples=960
   iops        : min= 1080, max= 1858, avg=1454.68, stdev=33.39, samples=960
  lat (msec)   : 2=0.67%, 4=5.13%, 10=16.40%, 20=26.47%, 50=23.15%
  lat (msec)   : 100=5.53%, 250=2.11%, 500=20.56%
  cpu          : usr=1.07%, sys=3.53%, ctx=309828, majf=0, minf=45
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=174754,174691,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=91.0MiB/s (95.4MB/s), 91.0MiB/s-91.0MiB/s (95.4MB/s-95.4MB/s), io=10.7GiB (11.5GB), run=120069-120069msec
  WRITE: bw=90.9MiB/s (95.3MB/s), 90.9MiB/s-90.9MiB/s (95.3MB/s-95.3MB/s), io=10.7GiB (11.4GB), run=120069-120069msec

So let's use this until #29 gets fixed.
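
As a consistency check on the fio summary, bandwidth should equal IOPS times the block size. A minimal sketch using the average IOPS values from the output above and the 64 KiB block size from `--bs=64k`; `bw_from_iops` is a hypothetical helper name:

```shell
bw_from_iops() {
    # $1 = IOPS, $2 = block size in KiB.
    # Prints the bandwidth in MiB/s and in decimal MB/s, like fio does.
    LC_ALL=C awk -v iops="$1" -v bs="$2" 'BEGIN {
        kib = iops * bs
        printf "%.1f MiB/s (%.1f MB/s)\n", kib / 1024, kib * 1024 / 1000000
    }'
}

bw_from_iops 1455.25 64   # read:  avg IOPS from the fio output
bw_from_iops 1454.68 64   # write: avg IOPS from the fio output
```

The results, 91.0 MiB/s (95.4 MB/s) for reads and 90.9 MiB/s (95.3 MB/s) for writes, match the READ and WRITE lines of the run status group.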

rarias commented 2023-09-12 12:33:06 +02:00 (Migrated from pm.bsc.es)

marked the checklist item Do some benchmarks as completed

Reference: rarias/jungle#28