Switch doesn't reply on Ethernet ports #1
The 48-port Dell S3048-ON switch (10 Gbps) stopped replying on its Ethernet ports after the power surge on 2023-03-28. A new 1 Gbps Cisco switch has been used to connect the nodes in the meantime. I requested the old switch to inspect the power supply and see if it is easy to fix.
It will be beneficial to have this switch back online as it has a higher speed and the NFS is mounted through this network.
Power supply 1 delivers 12 V as expected, as indicated by the OK LED.
The Ethernet port on the left front, used for the serial connection, doesn't set up a link with my laptop, and neither do any of the other Ethernet ports.
The main board starts, turns off the disk and CPU LEDs and sets up the Ethernet link, but reboots after a few seconds.
The voltage between GND and the TX pin of J4 is 5.5 V, and between TX and pin 4 it is 0 V.
The best course of action seems to be to connect a serial port to J4 and try to get a console so we can see the problems during boot. A possible explanation is that the file system (in PCI express NVRAM) may be bad and it cannot fully boot.
The flash is an 8 GiB mSATA 3IE3 module, which also seems to have a serial port and a jumper.
The blue LED turns on with the board and blinks.
No response on the external serial port at 115200 or 9600 baud while pressing Enter. The cable I'm using is meant for Cisco, but both seem to use the same pinout.
Also, the external serial port doesn't provide any voltage on any pin, which shouldn't be the case.
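The baud-rate check above can be scripted so it is easy to repeat on the J4 header later. This is only a minimal sketch: `probe_console`, the `Port` interface and the `open_port` callback are hypothetical names introduced here, not part of any tool used in this issue.

```python
from typing import Callable, Iterable, Optional, Protocol


class Port(Protocol):
    """Minimal interface the probe needs (hypothetical, for illustration)."""

    def write(self, data: bytes) -> int: ...
    def read(self, size: int) -> bytes: ...
    def close(self) -> None: ...


def probe_console(open_port: Callable[[int], Port],
                  bauds: Iterable[int] = (115200, 9600),
                  attempts: int = 3) -> Optional[int]:
    """Return the first baud rate at which the console answers, else None."""
    for baud in bauds:
        port = open_port(baud)
        try:
            for _ in range(attempts):
                port.write(b"\r")   # same as pressing Enter
                if port.read(64):   # any byte coming back counts as a reply
                    return baud
        finally:
            port.close()            # always release the port before the next rate
    return None
```

With pyserial installed, `open_port` could be something like `lambda b: serial.Serial("/dev/ttyUSB0", baudrate=b, timeout=1)`, where the device path is an assumption about how the adapter shows up on the laptop. A return value of `None` matches what was observed here: no reply at either rate.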
I should try the J4 connector.
Connected via external USB:
BIOS reports some EFI problems:
Diagnostics segfaults:
Similar problem as: https://www.dell.com/community/Networking-General/S3048-ON-died-after-a-shutdown-and-reboot/m-p/8057901
Let's try resetting the BIOS.
Still failing on POST:
Let's try booting ONIE rescue mode. Loads:
dmesg is available:
BIOS is "Primary BIOS 3.24.0.0-9" so we may try an update:
https://www.dell.com/support/home/en-ae/drivers/driversdetails?driverid=vm23r&oscode=naa&productcode=force10-s3048-on
For 3.24.5.0-11
BIOS chip is Winbond as documented in the update documentation:
And I can see the chip in the corner, along with a jumper connector J1006.
Here is the datasheet: https://www.winbond.com/hq/support/documentation/levelOne.jsp?__locale=en&DocNo=DA00-W25Q64FV
Starting:
Programming BIOS flash: s3000-bios-3.24.0.0-11.bin ...
Lattice reprogrammed:
Somehow it fails to reboot:
So let's reboot it manually.
Great, now it reboots a few times and then fails.
ONIE rescue still boots. Here is dmidecode output:
There is no i2c-2 device:
So, one possible explanation is that the MAC address is not being loaded from the EEPROM because there is no i2c-2 device:
The EEPROM format is explained here: https://opencomputeproject.github.io/onie/design-spec/hw_requirements.html#board-eeprom-information-format
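For reference, here is a rough sketch of the TlvInfo layout as I read it in the linked spec: an 8-byte `TlvInfo\0` ID, a version byte, a 2-byte big-endian total length, then type/length/value records ending with a CRC-32 record (type 0xFE) that covers everything before the checksum bytes. The function names and the sample MAC address are made up for illustration, and this is not the code that onie-syseeprom runs.

```python
import struct
import zlib

HEADER_ID = b"TlvInfo\x00"   # 8-byte ID string at the start of the EEPROM
HEADER_VERSION = 0x01
TLV_BASE_MAC = 0x24          # Base MAC Address, 6 bytes
TLV_CRC32 = 0xFE             # must be the last TLV


def build_tlvinfo(tlvs):
    """Encode (type, value) pairs and append the CRC-32 TLV."""
    body = b"".join(bytes([t, len(v)]) + v for t, v in tlvs)
    body += bytes([TLV_CRC32, 4])        # CRC type + length are covered by the CRC
    total_len = len(body) + 4            # TLV bytes that follow the header
    head = HEADER_ID + bytes([HEADER_VERSION]) + struct.pack(">H", total_len)
    crc = zlib.crc32(head + body) & 0xFFFFFFFF
    return head + body + struct.pack(">I", crc)


def parse_tlvinfo(raw):
    """Return (tlvs, crc_ok) for a raw EEPROM image."""
    if len(raw) < 11 or raw[:8] != HEADER_ID or raw[8] != HEADER_VERSION:
        raise ValueError("not a TlvInfo EEPROM")
    (total_len,) = struct.unpack(">H", raw[9:11])
    end = 11 + total_len
    if end > len(raw):
        raise ValueError("truncated EEPROM image")
    tlvs, pos, crc_ok = [], 11, False
    while pos + 2 <= end:
        t, length = raw[pos], raw[pos + 1]
        value = raw[pos + 2:pos + 2 + length]
        if t == TLV_CRC32:
            if len(value) == 4:
                stored = struct.unpack(">I", value)[0]
                crc_ok = stored == (zlib.crc32(raw[:pos + 2]) & 0xFFFFFFFF)
            break
        tlvs.append((t, value))
        pos += 2 + length
    return tlvs, crc_ok


if __name__ == "__main__":
    mac = bytes.fromhex("001122334455")   # made-up MAC, for illustration only
    image = build_tlvinfo([(TLV_BASE_MAC, mac)])
    print(parse_tlvinfo(image))
```

A dump read from the wrong bus or address fails the header or CRC check, which is a quick way to tell a missing EEPROM apart from a corrupted one.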
The onie-syseeprom binary has the i2c device number hardcoded to 2. However, I can see that it tries to write to the 0x73 port:
Which happens to be available on the i2c-1 device:
Interestingly, setting the device to 1 shows the CRC correctly:
The Ethernet port on top of the board (inside) works after manually assigning an IP address, and allows a root shell via SSH.
DCLI loads and complains:
After connecting the fans to the board (which make a terrible noise), the PCI POST now works fine. Then the switch boots properly and turns on the LEDs.
Here is the full boot:
Quick check to test speed from xeon07 to ssfhead before changing the switch:
Close to 1 Gbps (minus SSH overhead).
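To compare such a measurement against the nominal link rate, the byte count and elapsed time reported by the transfer tool can be converted to Gbps. This is a small sketch with hypothetical function names. The 112 MB/s figure in the comment is only an illustrative value, not the result measured here.

```python
def throughput_gbps(num_bytes: int, seconds: float) -> float:
    """Convert a transfer of num_bytes taking `seconds` into gigabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_bytes * 8 / seconds / 1e9


def link_utilization(num_bytes: int, seconds: float, link_gbps: float = 1.0) -> float:
    """Fraction of the nominal link rate that the transfer achieved."""
    return throughput_gbps(num_bytes, seconds) / link_gbps


if __name__ == "__main__":
    # Illustrative numbers only: 112 MB transferred in one second.
    print(throughput_gbps(112_000_000, 1.0))
    print(link_utilization(112_000_000, 1.0, link_gbps=1.0))
```

Tools that report MiB/s rather than MB/s need the byte count computed with 1024-based units before calling these helpers.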
Switch is back! Installed and working with the two power supplies:
However, the speed is still 1 Gbps:
And the previous test barely changes:
This is correct, as this switch only has 10 Gbps on the so-called uplink port (optical fiber), which is used to interconnect switches:
https://www.delltechnologies.com/asset/en-us/products/networking/technical-support/dell-networking-s3048-on-spec-sheet.pdf
So, this can be closed now. Let's see how long it keeps running.