Author Topic: The continuing saga of pfft and his OSD(s)...  (Read 6950 times)
pfft2001
Sr. Member
****
Posts: 378



« on: January 02, 2011, 07:22:48 am »

This post is more or less FYI - but I am curious about the recovery question.

The other day, one of my OSDs crashed (as a result of something I was working on), so I needed to power-cycle it.  Alas, it would not reboot (it hangs on the first "Orange Circle" - and yes, I did wait a long time; I went out of the room to do something else, but it stayed hung).  So, I decided to EFU it (Emergency Firmware Upgrade): I loaded up my SD card with OSDng and booted the machine with the SD card inserted.  It started up - got to the "ping pong" screen, then crashed with "Sorry - Package Error".  I tried it a couple more times, with the same result, then decided that maybe you can't load OSDng this way (because all the examples show loading OSDng via the Browse option in the GUI), so I tried the regular Arizona.  Still got Package Error.

Now, I'm beginning to panic, as it went back to the "Orange Circle" after each failed Package attempt.  However, miraculously, I guess, the last time, after sitting at the Orange Circle for a bit, it went ahead and booted (and booted into OSDng).  So, whatever problem was there seems to have gone away.

Still, it worries me.  I've never had this problem (Package Error) before.  What causes it?  And what caused it to go away?  And what is recovery if the problem doesn't go away???
Logged
heyrick
Global Moderator
Sr. Member
*****
Posts: 340



« Reply #1 on: January 02, 2011, 03:46:30 pm »

Quote
Alas, it would not reboot (it hangs on the first "Orange Circle" - and yes, I did wait a long time; I went out of the room to do something else, but it stayed hung).

Very strange. I wonder if mucking up the little area of data used by U-boot could lead to something like this?


Quote
so I tried the regular Arizona.  Still got Package Error.

<gulp!>


Quote
Now, I'm beginning to panic,

I would be, too. More so as I don't have any other OSD to use instead.


Quote
Still, it worries me.  I've never had this problem (Package Error) before.  What causes it?  And what caused it to go away?  And what is recovery if the problem doesn't go away???

I'd be worried too. Unexplained things are very disturbing. I hate to mention words like "haunted", but I've seen enough weird things to make me wonder if the spirit world doesn't once in a while stick its finger into our electronic world and wiggle it, see what happens...

Okay, it looks like the package errors are all in .../neuros-bsp/bootloader/board/ntosd-dm320/image_reload.c and it is all the usual suspects - file too short, too long, naff header, wrong firmware type, blah blah. To be honest, I would be inclined to look for, and keep, a dead simple 1GB SD card for restoration purposes. Just because an SD works in the OSD normally doesn't mean it will work in the boot reflash (look at SDHC, for instance).

If the problem doesn't go away, the next step for you is probably to hook the serial into a computer and poke around in U-Boot directly. As you're getting the orange circle, you're at least having the OSD initialise itself. That might not be a lot of comfort, but it is better than a "bricked" unit. And, from this point, you'd be best off talking to people way smarter than me. I have zero experience doing a recovery this way, and given as I only have the one OSD (that I use a lot), I don't plan on experimenting.


That said - if anybody has an unwanted OSD (any state so long as it actually boots!), then get in touch with me.


Best wishes,

Rick.
Logged
ChadV
Administrator
Hero Member
*****
Posts: 1611


« Reply #2 on: January 03, 2011, 09:51:14 am »

I can't think of anything that Rick didn't already mention...
Logged
pfft2001
Sr. Member
****
Posts: 378



« Reply #3 on: January 03, 2011, 11:40:33 am »

Quote
I can't think of anything that Rick didn't already mention...

OK - well I'm sure if you do, you'll let me know.

As stated, what I'm most interested in is: If it boots to the Orange Circle and won't go any further (at some point in the future), what should I do?

Also, I wanted to add that the SD card that I use to do the EFU is the same one I've always used - it has always worked in the past (for EFU purposes).  It is an 8GB SDHC card.  I know that I've read here that SDHC might not work (for EFU purposes), but it has always worked for me (until now...)
Logged
greyback
Administrator
Hero Member
*****
Posts: 1639


« Reply #4 on: January 03, 2011, 01:16:09 pm »

I'd suggest using the last 3.33-1.77 firmware for an Emergency Update. Don't forget the disable_upk_version_check file too.

Then you can do an update to OSDng I guess.

Reason for this thinking is that the Arizona UPK files are more complex (with the CF card data), and I'm not 100% certain that the uboot updater is able to deal with them.
-G
Logged
heyrick
Global Moderator
Sr. Member
*****
Posts: 340



« Reply #5 on: January 03, 2011, 04:33:24 pm »

I'd agree with greyback - get something 'normal' on the box first, then upgrade to OSDng.

I am wondering... there's data in /dev/mtdblock2 which would appear to control U-Boot's startup. Perhaps if this somehow got messed up, it could throw things off.

Grab a copy of the contents with:
$ head /dev/mtdblock2 > mtdblock2.txt

Then pull it over to your computer using the built-in web server (or cp it to removable media...). You'll then probably need to use a hex editor to blank the first four bytes, and clip off all the null padding. This is because Windows will most likely misread the file and display a load of Chinese. Then, with what is left (5K out of 128K), change all 0x00 bytes to 0x0A.
Save.

Now you can load the file into a text editor. I recommend Metapad, or anything that isn't Notepad and understands Unix line termination.

Why? Because if the emergency upgrade stops working, you *may* need to try talking to U-Boot directly. Or at least look to see what its environment is.
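For anyone who would rather script the hex-editor steps above, here is a minimal Python sketch. It assumes the dump follows the usual U-Boot environment layout: a 4-byte CRC32 at the front (which is why those bytes are not text), then NUL-separated name=value strings ending in an empty string, then padding. The little-endian CRC byte order and the idea that the CRC covers everything after the first four bytes are assumptions for this board, and parse_uboot_env is just a name made up for the example.

```python
import zlib


def parse_uboot_env(raw: bytes) -> tuple[bool, dict[str, str]]:
    """Decode a raw U-Boot environment dump into (crc_ok, variables)."""
    # Assumption: the first four bytes are a little-endian CRC32 of the rest.
    stored_crc = int.from_bytes(raw[:4], "little")
    data = raw[4:]
    crc_ok = zlib.crc32(data) == stored_crc

    # The variable list ends at the first empty string (two NULs in a row);
    # everything after that is padding.
    end = data.find(b"\x00\x00")
    if end != -1:
        data = data[:end]

    variables: dict[str, str] = {}
    for entry in data.split(b"\x00"):
        if not entry:
            continue
        # Split on the first "=" only, so values such as
        # "console=console=ttyS0,115200n8" keep their own "=" signs.
        name, _, value = entry.decode("latin-1").partition("=")
        variables[name] = value
    return crc_ok, variables
```

Something like `ok, env = parse_uboot_env(open("mtdblock2.txt", "rb").read())` should then give a dictionary that can be printed one variable per line. If ok comes back False, that may only mean one of the assumptions above is wrong, not that the environment is damaged.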

Here is (my) OSDng one, should you have need to refer to it:
--8<--------
bootdelay=3
baudrate=115200
hostname=neuros
bootfile=
update-rootfs=run rootfs_flash_loc_cmd;tftp $(loader_addr) $(rootfs_flash_loc);erase 008C0000 008DFFFF;cp.b $(loader_addr) 008C0000 $(filesize);setenv rootfs_flash_size $(filesize);setenv filesize;saveenv
flash_boot=run ramfs_cmd;cp.b 008C0000 $(loader_addr) $(rootfs_flash_size);bootm $(kernel_addr) $(loader_addr)
rootfs_flash_loc=/images/initrd.boot
nfs_serverip=192.168.1.1
bootargs=console=ttyS0,115200n8 root=/dev/mtdblock4 ro ip=192.168.1.100:192.168.1.1:192.168.1.1:255.0.0.0:neuros::off mem=14M
rootfs_flash_cramfs_loc=/images/root.cramfs
fileaddr=1800000
gatewayip=192.168.1.1
netmask=255.0.0.0
ipaddr=192.168.1.100
serverip=192.168.1.1
cur_uboot=2.09-00.871
cur_kernel=2.09-00.871
package_dir=/mnt/tmpfs/mount_USB/osdng-dev-2.52.upk
ethaddr=00:18:11:80:34:d9
sn=A5840709024F
loader_addr=0x01800000
kernel_addr=00160000
tftp_root=/images
nfs_root=/home/bcarnes/osd/trunk/neuros-bsp/rootfs/fs
defenv_fname=default_env.img
defenv_loc=/home/bcarnes/osd/trunk/neuros-bsp/images/default_env.img
defenv_loc_cmd=setenv defenv_loc $(tftp_root)/$(defenv_fname)
bootloader_loc=/home/bcarnes/osd/trunk/neuros-bsp/images/u-boot.bin
bootloader_fname=u-boot.bin
bootloader_loc_cmd=setenv bootloader_loc $(tftp_root)/$(bootloader_fname)
kernel_fname=uImage
kernel_loc=/home/bcarnes/osd/trunk/neuros-bsp/images/uImage
kernel_loc_cmd=setenv kernel_loc $(tftp_root)/$(kernel_fname)
rootfs_flash_fname=initrd.boot
rootfs_flash_loc_cmd=setenv rootfs_flash_loc $(tftp_root)/$(rootfs_flash_fname)
rootfs_flash_cramfs_fname=root.cramfs
rootfs_flash_cramfs_loc_cmd=setenv rootfs_flash_cramfs_loc $(tftp_root)/$(rootfs_flash_cramfs_fname)
rootfs_nfs_loc=/home/bcarnes/osd/trunk/neuros-bsp/rootfs/fs
rootfs_nfs_loc_cmd=setenv rootfs_nfs_loc $(nfs_root)
console=console=ttyS0,115200n8
display=tv
mem_reserve=mem=34M
ip=ip=192.168.1.100:192.168.1.1:192.168.1.1:255.255.255.0:neuros::off
nfs_mount_params=udp,v3,rsize=4096,wsize=4096
nfs_cmd=setenv bootargs $(console) root=/dev/nfs ro nfsroot=$(nfs_serverip):$(rootfs_nfs_loc),$(nfs_mount_params) $(ip) $(mem_reserve)
ramfs_cmd=setenv bootargs $(console) root=/dev/ram0 rw $(ip) $(mem_reserve)
cffs_cmd=setenv bootargs $(console) root=/dev/hdc1 rw $(ip) $(mem_reserve)
cramfs_cmd=setenv bootargs $(console) root=/dev/mtdblock4 ro $(ip) $(mem_reserve)
update-locs=run bootloader_loc_cmd;run kernel_loc_cmd;run rootfs_flash_loc_cmd;run defenv_loc_cmd;run rootfs_nfs_loc_cmd;saveenv
update-defenv=run defenv_loc_cmd;tftp $(loader_addr) $(defenv_loc);setenv filesize;autoscr $(loader_addr);run update-locs
update-uboot=run bootloader_loc_cmd;tftp $(loader_addr) $(bootloader_loc);protect off 00100000 0013FFFF;erase 00100000 0013FFFF;cp.b $(loader_addr) 00100000 $(filesize);setenv filesize
update-kernel=run kernel_loc_cmd;tftp $(loader_addr) $(kernel_loc);erase $(kernel_addr) 002DFFFF;cp.b $(loader_addr) $(kernel_addr) $(filesize);setenv filesize
update-ipdhcp=setenv ip ip=::::$(hostname)::dhcp;saveenv
update-ipstatic=setenv ip ip=$(ipaddr):$(serverip):$(gatewayip):$(netmask):$(hostname)::off;saveenv
update-cramfs=run rootfs_flash_cramfs_loc_cmd;tftp $(loader_addr) $(rootfs_flash_cramfs_loc);erase 002E0000 00FFFFFF;cp.b $(loader_addr) 002E0000 $(filesize);setenv filesize;saveenv
update-uboot-nand=run bootloader_loc_cmd;tftp $(loader_addr) $(bootloader_loc);nand erase 80000 40000;nand write $(loader_addr) 80000 40000;setenv filesize
update-kernel-nand=run kernel_loc_cmd;tftp $(loader_addr) $(kernel_loc);nand erase E0000 180000;nand write $(loader_addr) E0000 180000;setenv filesize
update-cramfs-nand=run rootfs_flash_cramfs_loc_cmd;tftp $(loader_addr) $(rootfs_flash_cramfs_loc);nand erase 260000 D20000;nand write $(loader_addr) 260000 D20000;setenv filesize
devboot=tftp $(loader_addr) $(bootloader_loc);setenv filesize;go $(loader_addr)
devkernel=run nfs_cmd;tftp $(loader_addr) $(kernel_loc);setenv filesize;bootm $(loader_addr)
cf_boot=run cffs_cmd;bootm $(kernel_addr)
cramfs_boot=run cramfs_cmd;bootm $(kernel_addr)
nand_boot=run cramfs_cmd;nboot $(loader_addr) 0 E0000;bootm $(loader_addr)
bootcmd=run cramfs_boot
cur_rootfs=2.52
ext_apps=2.52
upgrade_flag=done
tvoutput_mode=pal

--8<--------


Anything beyond that, you could well be looking at building yourself a JTAG to poke the firmware into the box the sledgehammer way.


I'm still concerned as to why it decided to misbehave, and why it decided to stop misbehaving. It wasn't overheating, was it? And in poking around with it, you moved it away from where it was so it could cool down?


Best wishes,

Rick.
Logged
pfft2001
Sr. Member
****
Posts: 378



« Reply #6 on: January 30, 2011, 09:05:47 am »

Well, I just had another go-around with this today.  Took me about an hour of cursing at it to get it working again.

Basically, I think the triggering event was the weekly "check for the latest version from Neuros" "feature".

That screen came up - saw that I wasn't running the latest from Neuros (since I am running OSDng, which is "post-Neuros") and decided to "upgrade" me.  After that, all heck broke loose.  So, my first question is: How can I turn off this weekly check?  (I've been living with it for years now - and have found it annoying  - but after today's adventure, it is more than annoying).  I'm pretty sure I've seen a setting in the menus somewhere to turn it off - but I couldn't find it just now.  So, if someone could inform me where it is,  that'd be nice.

Now, the next thing to note is that when this happens (and I'm sure it will happen again), the key thing is to hook up to the serial port - at least then you have some idea what's going on.  I didn't do this the last time, but this time I did.  I found that I was dropped into something called "Neuros Devboard", which was quite interesting.  I won't go into all the gory details, but suffice to say that the trick is that when it tries to boot the kernel, it does a CRC checksum and that checksum sometimes fails.  When it fails, you get dropped into the "Devboard" prompt (on the serial line).  From here, you can type "boot" to try the boot again.  And, what's really strange is, it sometimes works (even after it has failed - and without my changing anything or doing anything to get it to work again).

So, the basic fix is to just keep trying "boot" from the Devboard prompt until it works.  It usually only takes 2 or 3 retries.
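The retry-until-it-works approach could be automated from a PC on the serial line. This is only a sketch under assumptions: `port` is any object with write() and readline() (a pyserial Serial object opened at 115200 baud with a timeout would fit), and the strings "Starting kernel", "Bad Data CRC" and "Bad Header Checksum" are the messages U-Boot normally prints. I have not checked what the Neuros build actually prints, so adjust them to match your own serial log.

```python
def retry_boot(port, max_attempts=5, max_lines=200):
    """Send "boot" until U-Boot reports that it is starting the kernel.

    Returns the attempt number that succeeded, or 0 if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        port.write(b"boot\n")
        for _ in range(max_lines):
            line = port.readline()
            if not line:
                # Read timed out with no more output; try again.
                break
            if b"Starting kernel" in line:
                return attempt
            # Assumed failure messages from U-Boot's image check.
            if b"Bad Data CRC" in line or b"Bad Header Checksum" in line:
                break
    return 0
```

Logging which attempt succeeded over a few weeks might also show whether the failures are getting more frequent, which would point at the flash rather than at bad luck.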

I'd be most grateful if someone could suggest a more analytically correct way to fix this...
Logged
pfft2001
Sr. Member
****
Posts: 378



« Reply #7 on: January 30, 2011, 09:17:13 am »

I'd also like to add that I think the general problem is that the booting is just too fragile (for this to be considered an end-user consumer product).

I noticed, when I was in the Devboard prompt, that there are various options there for booting from the network.  It'd be really nice if I could do that - I think it would be more reliable than relying on the onboard kernel.  I'd like to suggest that if anyone is doing more work on setting up boot environments for the OSD (and yes, I know you are doing this, Rick), that we focus on network booting.  I think that'd be really cool.

Incidentally, note that somewhere on the OSDng site, there is a reference to a way to set up the OSD to run a full-blown version of Debian via network booting.  So, it is clear that it can be done.
Logged
greyback
Administrator
Hero Member
*****
Posts: 1639


« Reply #8 on: January 30, 2011, 06:53:59 pm »

Hey,
your standard boot is failing with a CRC error? Usually when a CRC error occurs and you run "printenv" you get a very different output to usual. That happen to you? From my time in helping out with support, this sort of error was super-rare. Perhaps the flash is getting a little knackered? I'd try a complete reflash with a UPK, that could help re-jitter those bits. Have you tried that?

DevBoard prompt is uboot. This is the bootloader, which like grub has a basic shell and a few commands. Netbooting can be set up here. Have a look at this wiki page on how to do that.

I myself used to have my OSD permanently set up to netboot. It was so that I could mess with the usually-read-only partition, and install software where it usually couldn't go. Also meant the CF-card slot was free. It worked fine.

Auto-update thingy is a pain, and I'm not sure if it can be turned off. It does tie into the same system as the recording scheduler. Have you looked at the SQLite database to see if anything non-recording related is mentioned there? Perhaps with a once-a-week entry...? :)

I'm sure you could manage to get Debian booting. I'd be surprised if a vanilla kernel would boot on the OSD, since the dm320 isn't a supported architecture. Neuros' kernel has added code for the board and all the drivers, and they're not in the mainline kernel sadly. But if you stick to Neuros' kernel, it should boot, and then you can pile on the packages!

Incidentally I've spent the last 2 days chopping these private drivers out of Neuros' kernel, and I've ported them up to 2.6.17.14. Porting is a big word, just the serial driver & mmc needed some work. The kernel builds, but I need to test (and am missing my serial cable, so will be tricky:( )

I chose 2.6.17.14 as it wouldn't be too much work, but it supports EABI. I *think* the current firmware uses OABI, which isn't as efficient. Certainly floating point arithmetic will get a speed-up with EABI. Then with EABI support, I can use a more modern toolchain and hopefully squeeze a little more performance from the OSD.

I intend to use the buildroot system, to quickly put together a nice base for a future OSD firmware. To add software, you just tick boxes! Most common stuff (Busybox, Dropbear, Qt, Sqlite) is ready to use and patched to work well. Saves re-inventing the wheel.

My only worry is that the closed binary modules will not work with my newer kernel. Oh, and my kernel is broken, that could be true too. I'll keep you guys informed.

pfft: good luck with the netboot!
-G
Logged
pfft2001
Sr. Member
****
Posts: 378



« Reply #9 on: January 30, 2011, 07:52:34 pm »

Quote
Hey,
your standard boot is failing with a CRC error? Usually when a CRC error occurs and you run "printenv" you get a very different output to usual. That happen to you?

I haven't tried it - and I'm not likely to do so until the next time it crashes.  But I am familiar with the concepts behind the uboot (aka, Devboard) prompts - it looks a lot like the similar thing on Sun machines (Sparc, etc).  I've used it on Sun a lot.

But yeah, the key thing here is that when the boot crashes on the "Orange Circle", it is because the boot failed and the serial is sitting at the Devboard prompt.  I wish I had known that earlier - I just found it by accident today.  But as I said, it seems to correct itself if you just keep trying it until it works.  It seems to me that it is either "just random" (voodoo or sunspots) or there's something going on w.r.t. which kernel it is booting.  I think it might have something to do with when I do an Emergency Firmware Upgrade; doesn't that stash a copy of the new kernel somewhere else, and then redirect the booting process to use that one?

In any event, as I said, I was able to get it to boot by just retrying it until it works.


Quote
From my time in helping out with support, this sort of error was super-rare. Perhaps the flash is getting a little knackered? I'd try a complete reflash with a UPK, that could help re-jitter those bits. Have you tried that?

Possibly.  As I've noted, I've not been able to do an EFU successfully for a while now (Get message "Package Error").  I think that at least one of the root causes is that my SD card may just be defective.  I need to try it with a different card (at some point in the unspecified future).

Quote
Netbooting can be set up here. Have a look at this wiki page on how to do that.

Looks interesting.  I haven't had time to fully digest it yet.  But note that I already have a netboot (tftp) setup running on my network.  It serves up Linux and a few varieties of DOS.  I think it'd be pretty easy to add the OSD stuff to it.  But the key problem that I see is that, as I understand it, there is no real kernel file in the OSD.  It's not a file in the filesystem - it is stored somewhere mysterious, right?

Quote
Auto-update thingy is a pain, and I'm not sure if it can be turned off. It does tie into the same system as the recording scheduler. Have you looked at the SQLite database to see if anything non-recording related is mentioned there? Perhaps with a once-a-week entry...? :)

Yes.  I'm 82% sure that this entry in scheduler.sql is the one.  I was just waiting for some confirmation from you before going ahead and deleting it.  But who knows what side effects deleting might have???

sqlite> select * from main where id=1;
1|1296950400|0|1296950400|updater|weekly|
sqlite> .exit
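Since the side effects are unknown, one cautious way to do the delete is to copy the database first, so the row can be put back. Below is a minimal Python sketch, meant to be run on a PC against a copy of scheduler.sql pulled off the OSD. The table name main and the id column come from the query above; the function name and the ".bak" suffix are made up for the example. As a sanity check on the row itself: if the two 1296950400 fields are Unix timestamps (an assumption), they correspond to 2011-02-06 00:00 UTC, exactly one week after this post, which would fit a weekly job.

```python
import shutil
import sqlite3


def remove_schedule_entry(db_path: str, entry_id: int) -> int:
    """Back up the database, delete one row from `main`, return rows deleted."""
    # Keep a copy next to the original so the entry can be restored.
    shutil.copy2(db_path, db_path + ".bak")

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if the DELETE fails
            cur = conn.execute("DELETE FROM main WHERE id = ?", (entry_id,))
        return cur.rowcount
    finally:
        conn.close()
```

Calling `remove_schedule_entry("scheduler.sql", 1)` should return 1 if the updater row was there. Whether the OSD recreates the entry on its next boot is something only a test will show.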

Quote
I'm sure you could manage to get Debian booting. I'd be surprised if a vanilla kernel would boot on the OSD, since the dm320 isn't a supported architecture. Neuros' kernel has added code for the board and all the drivers, and they're not in the mainline kernel sadly. But if you stick to Neuros' kernel, it should boot, and then you can pile on the packages!

But how do I *get* the kernel???  As I said above, it doesn't seem to be a file anywhere!

Quote
My only worry is that the closed binary modules will not work with my newer kernel. Oh, and my kernel is broken, that could be true too. I'll keep you guys informed.

pfft: good luck with the netboot!
-G

Sounds interesting.  I'll let you know if I have any success with setting up netbooting.
Logged
greyback
Administrator
Hero Member
*****
Posts: 1639


« Reply #10 on: January 30, 2011, 09:17:55 pm »

Netboot works like this:
- via TFTP it gets the kernel from your server. The kernel is wholly contained in a file called "uImage". Then uboot jumps to the start of uImage and executes the kernel.
- then once booted, the root filesystem is a NFS export from your server.

Kernel source code is in neuros-bsp/kernels/linux. The build process creates a "uImage" file somewhere (arch/arm/boot maybe)

So you're correct that the kernel file is a bit mysterious. It's not contained in the root file system, but exists in its own little world.
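To see what the netboot kernel command line would look like, here is a small Python sketch that expands the $(name) references in the nfs_cmd variable from the environment Rick posted earlier in the thread. The values are copied from that dump, so the server address and NFS path are his, not yours. The expand helper is made up for the example, and it assumes U-Boot does a single substitution pass with unset variables expanding to nothing.

```python
import re

# Values copied from the environment dump posted earlier in the thread.
ENV = {
    "console": "console=ttyS0,115200n8",
    "nfs_serverip": "192.168.1.1",
    "rootfs_nfs_loc": "/home/bcarnes/osd/trunk/neuros-bsp/rootfs/fs",
    "nfs_mount_params": "udp,v3,rsize=4096,wsize=4096",
    "ip": "ip=192.168.1.100:192.168.1.1:192.168.1.1:255.255.255.0:neuros::off",
    "mem_reserve": "mem=34M",
    "nfs_cmd": "setenv bootargs $(console) root=/dev/nfs ro "
               "nfsroot=$(nfs_serverip):$(rootfs_nfs_loc),$(nfs_mount_params) "
               "$(ip) $(mem_reserve)",
}


def expand(template: str, env: dict[str, str]) -> str:
    """Replace each $(name) with env[name]; unknown names become empty."""
    return re.sub(r"\$\(([^)]+)\)", lambda m: env.get(m.group(1), ""), template)


if __name__ == "__main__":
    print(expand(ENV["nfs_cmd"], ENV))
```

The printed line is the setenv bootargs command that "run nfs_cmd" would execute. The nfsroot= part is where your own NFS export has to go, and the devkernel entry in the same dump fetches the kernel over TFTP before calling bootm.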

Re. weekly thingy - I don't think deleting that entry will cause anything bad to happen. I say, go for it.
-G
Logged
pfft2001
Sr. Member
****
Posts: 378



« Reply #11 on: January 30, 2011, 09:30:35 pm »

Quote
Netboot works like this:
- via TFTP it gets the kernel from your server. The kernel is wholly contained in a file called "uImage". Then uboot jumps to the start of uImage and executes the kernel.
- then once booted, the root filesystem is a NFS export from your server.

Can you suggest a way to get the uImage file (without going the whole route of downloading and building the source)?  I'm really trying to avoid doing that (building from source).

Quote
Re. weekly thingy - I don't think deleting that entry will cause anything bad to happen. I say, go for it.
-G

I may just do that...
Logged
heyrick
Global Moderator
Sr. Member
*****
Posts: 340



« Reply #12 on: January 30, 2011, 09:38:26 pm »

I had another Debian briefly running, don't remember where/how, I'll see if I can find that SD card later on. I do remember it completely messed up the clock, so beware of side effects. I think the main problem here is anything we attempt to run must support the closed codecs, else not a lot of point really.

Odd you should be having such boot problems. Is it really that fragile, or is your box in a bad mood? That sort of behaviour would have me probing the power rails/supply. Why the fail then the work? Either it's broke or it ain't...

You can turn off the update, it is somewhere in the settings menu.

G - liking the idea of a better build system. I can't for the life of me figure out the current one, and given I'm using a slowish emulation, anything that can build without attempting to rebuild EVERYTHING can only be a good thing!

Re. performance, would this change make much difference? Most of the hard work takes place in the codecs/DSP. It seems to me that finding ways to trim Qt may make the biggest difference? But that would seem to be a mammoth job. What was Torfu based upon? Perhaps QtLite (or whatever it is called) is the step forward? How compatible is it with what we already have?


Best wishes,

Rick.
Logged
heyrick
Global Moderator
Sr. Member
*****
Posts: 340



« Reply #13 on: January 30, 2011, 09:44:16 pm »

Quote
Can you suggest a way to get the uImage file (without going the whole route of downloading and building the source)?  I'm really trying to avoid doing that (building from source).

Offhand, I think your best bet would be to pull apart a UPK file. Looks easy enough, a little bit of code ought to split them.
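I don't know the UPK layout, so the sketch below doesn't try to parse it. It scans any blob for the standard legacy uImage header (magic 0x27051956, 64 bytes, big-endian fields) and checks the header and data CRCs, which are the same checks U-Boot makes before booting. That the UPK stores the kernel as an unmodified uImage is an assumption, and find_uimages is a name made up for the example.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956
# Legacy uImage header: magic, hcrc, time, size, load, ep, dcrc,
# then os, arch, type, comp bytes and a 32-byte name (64 bytes total).
HEADER = struct.Struct(">7I4B32s")


def find_uimages(blob: bytes):
    """Yield (offset, name, image_bytes, data_crc_ok) for each valid header."""
    magic = UIMAGE_MAGIC.to_bytes(4, "big")
    pos = blob.find(magic)
    while pos != -1:
        header = blob[pos:pos + HEADER.size]
        if len(header) == HEADER.size:
            fields = HEADER.unpack(header)
            hcrc, size, dcrc, name = fields[1], fields[3], fields[6], fields[11]
            # The header CRC is computed with its own field set to zero.
            zeroed = header[:4] + b"\x00" * 4 + header[8:]
            if zlib.crc32(zeroed) == hcrc:
                start = pos + HEADER.size
                data = blob[start:start + size]
                ok = len(data) == size and zlib.crc32(data) == dcrc
                label = name.rstrip(b"\x00").decode("ascii", "replace")
                yield pos, label, header + data, ok
        pos = blob.find(magic, pos + 1)
```

Running it over the bytes of a UPK file and writing out the image_bytes of any hit would give a candidate uImage for the TFTP server. A hit with data_crc_ok False is worth noting too, since that is the kind of failure that drops the OSD to the Devboard prompt.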

If you're net booting, you could give my uImage a whirl? <wink> http://www.heyrick.co.uk/osd/


Best wishes,

Rick.
Logged
heyrick
Global Moderator
Sr. Member
*****
Posts: 340



« Reply #14 on: January 31, 2011, 08:48:47 am »

Quote
So, my first question is: How can I turn off this weekly check?  (I've been living with it for years now - and have found it annoying  - but after today's adventure, it is more than annoying).  I'm pretty sure I've seen a setting in the menus somewhere to turn it off - but I couldn't find it just now.  So, if someone could inform me where it is,  that'd be nice.

Main menu -> Settings -> Firmware upgrade -> set "Frequency" to "Off".

Ba-ding. ;)


Best wishes,

Rick.

[edit: typo, is "upgrade", not update...]
« Last Edit: January 31, 2011, 08:51:03 am by heyrick » Logged