ethernet failure

Added by Dennis Volper over 14 years ago

Background: I booted an L138 board, ran ifconfig, and got "Segmentation fault". I followed the steps on the "Linux Root File System" page to reload the jffs2 from the DVD image (20100509 seems current from the links). ifconfig now works, but it produces an empty route table and the board can neither ping nor be pinged. I rebooted to the ROM, ran setenv on ipaddr, saved, and ran ping, and it worked.
Several tests later, I borrowed one of our other L138 boards from another programmer, booted it, and ran ifconfig; the route table was there and ping worked both ways. When I reloaded the jffs2 on my board I got 5 "Skipping bad blocks" messages, followed by two bad blocks (004a0000, 00520000) when I "wrote" the jffs2; I don't know whether that is related.
Bottom line: I need to find out what is wrong with the particular board I'm using. I'm new to jffs2. When you boot from disk, the kernel from the disk image is used; when you boot from bootp, the kernel outside the initrd image is used; I'm not sure how it works with jffs2. I may need to reload the kernel. I found rebuild instructions, but the "Binary Releases" section of the "Arm9 Kernel" page of the Wiki is empty, and the "a pre-compiled Kernel tested with the Critical Link Industrial I/O board" mentioned on the "2010 05 Software release page" isn't a link. Any ideas?


Replies (8)

RE: ethernet failure - Added by Michael Williamson over 14 years ago

There are several items here. I can answer a couple of things but I will need to get more information to you tomorrow (including links to the pre-built u-boot and kernel images).

  1. The "stock" MityDSP-L138/F ships with the kernel always being loaded from SPI-NOR flash, regardless of where the root filesystem is mounted. You can see how the kernel is loaded if you interrupt u-Boot (hit a key) and then print out the bootcmd environment variable. It's unlikely that the kernel can be corrupted in the SPI memory. There may be a kernel image present on the NAND root filesystem, but by default this is not the kernel that is loaded. This is done to support NFS-level booting without the need to configure on-board NAND flash. Also, the u-Boot jffs2 NAND driver is not very fast, so retrieving the kernel from a JFFS2 NAND filesystem is rather slow.
  2. For the nand write.jffs2 operations, the "skipping bad blocks" message is OK (even "good"). NAND will intrinsically have bad blocks; that's why it's so cheap (compared to NOR-based FLASH memory). The manufacturer guarantees a certain number of good blocks, and will mark the bad ones on the device. JFFS2 is designed to "work around" the bad blocks without file corruption by not using (skipping over) them. The message you are seeing simply means that the u-boot writer found a bad block, but is passing over it -- and JFFS2 will be set up to deal with that. You will see different bad blocks for each different SOM module.
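
As a rough illustration of the skip-bad-block behaviour described in item 2, the following Python sketch models a writer that lays an image out over erase blocks and passes over any block marked bad. It is not the u-boot implementation: the 128 KiB erase-block size, the function name write_skipping_bad, the start offset, and the image size are all assumptions chosen for illustration. The two bad-block addresses are the ones reported earlier in this thread.

```python
BLOCK_SIZE = 0x20000  # assumed erase-block size (128 KiB); check your NAND part


def write_skipping_bad(image_len, start, bad_blocks, block_size=BLOCK_SIZE):
    """Return (used, skipped) block offsets for an image of image_len bytes.

    Blocks are consumed from `start` upward; any offset listed in
    bad_blocks is passed over and the data goes to the next good block.
    """
    blocks_needed = -(-image_len // block_size)  # ceiling division
    used, skipped = [], []
    offset = start
    while len(used) < blocks_needed:
        if offset in bad_blocks:
            skipped.append(offset)  # analogous to a "skipping bad block" message
        else:
            used.append(offset)
        offset += block_size
    return used, skipped


# Hypothetical run: a 6-block image written from offset 0x00480000,
# with the two bad blocks reported in this thread.
bad = {0x004A0000, 0x00520000}
used, skipped = write_skipping_bad(6 * BLOCK_SIZE, 0x00480000, bad)
print("skipped:", [f"{o:08x}" for o in skipped])
print("last block used:", f"{used[-1]:08x}")
```

The point of the sketch is that a skipped block costs space, not data: the image simply ends two blocks later than it would on a part with no bad blocks in that range.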

Now, that being said, the JFFS2 filesystem is by default mounted read/write and the block sizes are fairly large. That means that if you power off your unit while it is writing to disk (FLASH - JFFS2) you can corrupt your root filesystem and the executables that live on it. So at the moment, you need to do a "poweroff", a "shutdown", or a "reboot" to ensure that the root filesystem is unmounted cleanly before powering off. We're looking at ways to deal with this, and there is a lot of "google-able" material on the subject, including:

  • Mounting the root filesystem as read-only (e.g., "mount -o remount,ro /" in an rc.local script). This still requires system stability while booting.
  • Using an alternate filesystem for the root filesystem on the NAND or even the SPI-NOR, such as read-only CRAMFS or SQUASHFS.
  • Configuring the kernel to mount the root filesystem read-only.

Note that several RAM-based filesystems are already used for /tmp, /var, etc. One quick thing that will reduce the likelihood of corruption during boot or otherwise (in the interim) is to add the "noatime" option to the root filesystem both in the u-boot kernel command line (replace "rw" with "rw,noatime") and in /etc/fstab. This stops access times from being written to the filesystem for every executable launched during normal operation.
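
The command-line substitution suggested above can be sketched as a small Python helper. This only performs the textual edit (a standalone "rw" token becomes "rw,noatime"); the function name add_noatime and the sample bootargs string are hypothetical, so print your own bootargs from u-boot and adapt accordingly.

```python
def add_noatime(bootargs):
    """Replace a standalone "rw" token with "rw,noatime"; leave other tokens alone."""
    tokens = bootargs.split()
    return " ".join("rw,noatime" if t == "rw" else t for t in tokens)


# Hypothetical bootargs value, for illustration only.
example = "console=ttyS1,115200n8 root=/dev/mtdblock0 rw rootfstype=jffs2"
print(add_noatime(example))
```

Matching whole tokens keeps the helper from touching other arguments that merely contain the letters "rw", and running it twice leaves the string unchanged.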

As for the ethernet issue, I believe that there may be a recent bug fix in the kernel as well as in the u-boot image for the PHY detection/probing logic that may apply to early delivered units. The source for this should be available on the git repositories. I will post more details on that in a follow-up post.

RE: ethernet failure - Added by Dennis Volper over 14 years ago

1) I don't mind the jffs2 being read/write; it means I can easily set the board up to NFS mount the machine we are compiling on, just like I would on a disk-based Linux box. Thanks for the warning, I'll just treat Flash the same way I treat disk.
2) I seem to remember seeing a link on how to write the kernel to SPI, but I can't find it now. I've found the instructions for setting up the build and building the kernel; I still need download and burn instructions similar to those given for the jffs2.
3) Trying the "git" for the 4th time. I'm up to 15%, running at 1 KiB/s; the last 3 times I've gotten a server disconnected message. git doesn't seem to have to start over at the beginning when that happens.

RE: ethernet failure - Added by Michael Williamson over 14 years ago

I added a section called Installing the Kernel in the Linux_Kernel section of the Wiki. This covers flashing a newly built kernel into SPI-NOR flash.

RE: ethernet failure - Added by Dennis Volper over 14 years ago

Any hint on the "pre-compiled Kernel tested..."?

RE: ethernet failure - Added by Michael Williamson over 14 years ago

I placed images of the precompiled u-boot and the kernel in the files section of this support site.

Please check, when your board is booted, that the PHY address reported is "0x03".

You can see this in the u-boot printout:

...
I2C:   ready
DRAM:  128 MB
NAND:  256 MiB
In:    serial
Out:   serial
Err:   serial
ARM    Clock : 300000000 Hz
DDR    Clock : 132000000 Hz
EMIFA  CLock : 100000000 Hz
DSP    Clock : 300000000 Hz
ASYNC3 Clock : 150000000 Hz
Enet config  : 2
Resetting ethernet phy
Net:   Ethernet PHY: GENERIC @ 0x03 [0x8]
...
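
If you capture the serial console to a file, the PHY check above can be scripted. The following Python sketch pulls the PHY name, address, and optional bracketed value out of the "Ethernet PHY" line. The regular expression is inferred only from the one line shown in this thread, and the names PHY_LINE and parse_phy are hypothetical; the meaning of the bracketed value is not assumed here, it is simply reported if present.

```python
import re

# Pattern inferred from the single sample line in this thread (an assumption).
PHY_LINE = re.compile(
    r"Ethernet PHY:\s*(\S+)\s*@\s*(0x[0-9a-fA-F]+)(?:\s*\[(0x[0-9a-fA-F]+)\])?"
)


def parse_phy(log_text):
    """Return (name, address, bracket_value_or_None) for the first PHY line, or None."""
    m = PHY_LINE.search(log_text)
    if m is None:
        return None
    name, addr, extra = m.groups()
    return name, int(addr, 16), (int(extra, 16) if extra else None)


sample = "Resetting ethernet phy\nNet:   Ethernet PHY: GENERIC @ 0x03 [0x8]\n"
name, addr, extra = parse_phy(sample)
print(name, hex(addr), "OK" if addr == 0x03 else "unexpected address")
if extra is None:
    print("note: no bracketed value after the PHY address")
```

Reporting the bracketed field separately makes it easy to spot a board whose address matches but whose line is otherwise different from the expected printout.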

RE: ethernet failure - Added by Dennis Volper over 14 years ago

Got the 2nd and 3rd boards up and on the network with no problem. Still having trouble with the 1st board. What I see is:

DRAM: 64 MB
... other stuff matches
Resetting ethernet phy
Net: Ethernet PHY: GENERIC @ 0x03
...
U-Boot> boot
... lots of stuff
emac-mii: probed
...
eth0: no PHY found
net eth0: Davinci EMAC: request_irq() failed

I'm missing the "[0x8]" in the uboot area, and the "no PHY found" is probably why Linux can't find the network.

I've replaced the jffs2 (twice) and the kernel (once).
Any suggestions on how to proceed?

RE: ethernet failure - Added by Tim Iskander over 14 years ago

Dennis,
Sounds like you need to update the uboot image on that board. I think you got that one before the others, and there were some changes made to uboot for this issue. You will want to run
factoryconfig
and record the mac address and serial number before upgrading.
To upgrade the uboot image, see Das_U-Boot_Port
You will need to re-run the
factoryconfig set
factoryconfig save
and
config set
config save
commands after upgrading. Make sure the mac and s/n portion of the factoryconfig are correct, and then you can just keep pressing enter to accept the existing/default values on the rest (including all of the config settings).
cheers
/Tim

RE: ethernet failure - Added by Dennis Volper over 14 years ago

The new uboot worked. Problem solved. I can now scp a program onto the board and run it.
