Author Topic: Looking into MP4 repair  (Read 8921 times)
heyrick
Global Moderator
Sr. Member
*****
Posts: 340



View Profile WWW
« on: March 06, 2011, 05:43:05 pm »

** If you're looking for software to repair damaged MP4 files, you won't find it here! **


I am looking into the viability of having some way of being able to attempt to "repair" damaged MP4 files. What follows below are my notes, though it looks like it is ridiculously complicated.

For what it is worth, when my other PVR used to screw up finalising files, I tried a good few programs to recover MP4 data and I can say that exactly NONE of them worked. Hell, half of them didn't even seem to understand the file format. Things may have changed in a year, but honestly I don't expect much. The only thing I can be happy about is that in my time of owning the OSD, it only messed up finalising Zatoichi, and I was able to get the DVD from Amazon (used&new) for less than the cost of the postage. ;-)


Anyway, here's a rough guide to the murky insides of the mp4 file.


God almighty. It looks like there's a giant TOC at the end of the file, so in the case of a failed write pretty much nothing is pre-defined (only the "ftyp").


The structure of a valid file is:

  iso base

    '--- ftyp = "isom"
         This specifies the file type, plus a compatible type.
         Compatible "mp41" (MPEG 4 revision 1).
         It also seems as if "isom" is a media tag that is more
         intended for in-house work, and when "in the wild" it
         should specify the file type correctly (i.e. "mp41").

         It looks like QuickTime references ought to be the best
         source here; for the later "mp42" format is more in the
         realms of ISO and official standards ain't free...


  '----- mdat
         Gives size of movie data, plus the sample count, which
         would appear to be how many fragments there are. Also
         there is a non-parsed size (huh?).

         This may or may not be present? It looks like some data
         is present here in the non-finalised file, but the "mdat"
         ID and the word before are empty.


         Note - the *actual* data following resides elsewhere.


         '--- track (id #1)
         Defines the first track. Doesn't actually hold its own
         data, but instead is a sequence of chunks.
         In our test file, the first three chunks are at:
           +2756
           +141195
           +173453
         Note the chunk sizes are NOT regular.
         The first word of each chunk is "00 00 01 b6".


         '--- track (id #2)
         Defines the second track. It appears these ones are
         written before the video data, and they are smaller,
         so I think we can assume this is audio data.
         Test file, first three:
           +8
           +139383
           +171770
         Which if we calculate offsets from track one, would be:
           -2748
           -1812
           -1683
         So we cannot rely upon any offset relationship.
         The only thing that seems pretty much for certain is
         the first byte is &21, but this may not be true for
         long recordings.


  '----- moov
         Is apparently a "movie" atom as opposed to a "movie
         data" atom. I figure the difference is the data is
         the raw data (duh!) and this is the offset table.
         The type is "moov", and it is largish containing a long
         list of offsets. Example data:
           36
           2784
           139411
         which don't seem to correlate to the offsets above?


         '--- mvhd
         Movie header, gives length, speed, some sort of matrix...
         Of note, the tracks are listed backwards: 2,1


         '--- iods
         Unknown, "Initial Object Descriptor". Data is:
           00 00 00 21 69 6F 64 73 00 00 00 00 10 13 00 4F
           FF FF 0F 01 FF 0E 04 00 00 00 02 0E 04 00 00 00
           01
         What is this pointing to? Or meaning?

         For what it is worth, it is debatable if "mp41"
         should even have an "iods", as this is an "mp42"
         enhancement by the looks of things.


         '--- "trak" (id = 2)


              '--- "tkhd"
         Track header, seems to dupe a lot of the info in "mvhd"?


              '--- "edts"

                   '--- "elst"
          Edits, edit list? Seems to list media rate and segment
          duration.


              '--- "mdia"

                   
                   '--- "mdhd"
           Media header, gives duration in timescale units. It
           is of note that the "mvhd" gives a "timescale" of
           90000 while here the timescale is 1000 (so this
           duration is in milliseconds, not centiseconds).
           Additionally, the language seems to be undefined.


                   '--- "hdlr"
           Description, which is handler "soun", track type
           "Sound Track", name "Sound Media Handler".


                   '--- "minf"
           Media info.


                        '--- "smhd"
           Sound media header.

                        '--- "dinf"
                             '--- "dref"
                                  '--- "url"
            Empty pointer to a URL, this is used by RTP and
            some streaming media.


                        '--- "stbl"
            Sample table


                             '--- "stsd"
            Sample table description.


                                  '--- "mp4a"
            Now we know it is MP4A, I guess we can assume AAC
            instead of MP3. Sample rate is 16000.0, sample size
            is 16, and 2 channels.


                                       '--- "esds"
            Extended descriptors - there are three extended
            descriptor tags, which are all 128. Oddly enough this
            is the bitrate. Coincidence? ;-)


                                  '--- "stts"
            Decoding time to sample. Gives sample count and
            delta, usu. 64 but may vary. This may be impossible
            to regenerate, so we might have to just push all to
            be "64" and accept that some random audio missyncs
            are better than a junk file.


                                  '--- "stsc"
            Appears to list all samples in order, specifying the
            description index as "1" and the samples per chunk
            as "2" for all of them.


                                  '--- "stsz"
            Lists the size of each sample. Good luck. ;-)
            For what it's worth, the first three are:
              2033, 715, 868.
            Yeah, again, another sequence of digits which are
            gibberish compared to previous data.


                                  '--- "stco"
            Sample chunk offset.
            36, 139411, 171798, which is every other (odd) entry
            in the "moov" list.


         '--- "trak" (id = 1) [NOTE 1 follows 2]


              '--- "tkhd"
         Track header, seems to dupe a lot of the info in "mvhd"?
         There is a "height" and a "width" which is specified as
         "480" and "640" respectively. This would correlate to the
         record dimensions (640x480).


              '--- "edts"

                   '--- "elst"
          Edits, edit list? Seems to list media rate and segment
          duration.


              '--- "mdia"

                   
                   '--- "mdhd"
           Media header, gives duration (milliseconds). As for
           the audio track, the timescale is 1000.


                   '--- "hdlr"
           Description, which is handler "vide" (that means
           empty in French! ;-) ), track type "Video Track",
           name "Video Media Handler".


                   '--- "minf"
           Media info.


                        '--- "vmhd"
           Video media header. Graphicsmode=0, opcolor=0,0,0


                        '--- "dinf"
                             '--- "dref"
                                  '--- "url"
            As before, empty URL pointer.


                        '--- "stbl"
            Sample table


                             '--- "stsd"
            Sample table description.


                                  '--- "mp4v"
            Now we know it is MPEG-4 Simple Profile (MPEG-4
            Part 2, which is H.263-derived). Of note:
              compressorname = undefined
              depth = 24
              framecount = 1
              height = 480
              horizontal resolution = 72ppi
              vertical resolution = 72ppi
              width = 640


                                       '--- "esds"
            Extended descriptors - which are 59,0,0.
            Significance?


                                  '--- "stts"
            Decoding time to sample. Gives sample count and
            delta, usu. 40 but may vary (38-42).


                                  '--- "stsc"
            Appears to list all samples in order, specifying the
            description index as "1" and the samples per chunk
            as...
              5, 3, 3, 3, 3, 4, 3, 3, 3, 3, 4, 3, 3, 3, 3 [etc]
            I didn't see another 5 in the example file, so can
            we assume it is "4 3 3 3 3" repeated? 


                                  '--- "stsz"
            Lists the size of each sample. Good luck. ;-)
            For what it's worth, the first three are:
              37226 36283 42108 9187
            I give the 4th to show that we really can't make
            any assumptions.


                                  '--- "stco"
            Sample chunk offset.
            2784, 141223, 173481, which is every other (even)
            entry in the "moov" list.


                                  '--- "stss"
            Sync sample, the list is:
              1, 2, 3, 16, 31, 46, 61, 76, 91, 106, 121...
            ??? Keyframes?


I do not, as yet, know where in the file a lot of this stuff is actually located.
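
As a rough illustration of how the three tables might fit together, here is a minimal sketch that uses only the audio track figures quoted in the notes above. It assumes "stco" holds the file offset of each chunk, "stsc" says how many samples sit in a chunk, and "stsz" gives each sample's size, with samples packed back to back inside a chunk. The function name sample_offsets is made up for this example.

```python
# Values copied from the audio track notes (track id 2).
chunk_offsets = [36, 139411, 171798]   # "stco": where each chunk starts
samples_per_chunk = 2                  # "stsc": constant for this track
sample_sizes = [2033, 715, 868]        # "stsz": only the first three are quoted


def sample_offsets(chunk_offsets, samples_per_chunk, sample_sizes):
    """Return (offset, size) for every sample whose size is known."""
    located = []
    sizes = iter(sample_sizes)
    for chunk_start in chunk_offsets:
        position = chunk_start
        for _ in range(samples_per_chunk):
            size = next(sizes, None)
            if size is None:           # ran out of known sizes
                return located
            located.append((position, size))
            position += size           # assumption: samples are contiguous
    return located


locations = sample_offsets(chunk_offsets, samples_per_chunk, sample_sizes)
for offset, size in locations:
    print(f"sample at {offset}, {size} bytes")

# End of the first audio chunk = start of its second sample + that sample's size.
first_chunk_end = locations[1][0] + locations[1][1]
print(first_chunk_end)  # 2784, which is also the first video "stco" entry
```

If that reading is right, the first audio chunk ends exactly where the first video chunk begins, which would fit the idea that the two tracks are simply interleaved one chunk after another.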



There are a few things in our favour. The OSD uses, primarily, one video codec (so-called "industry standard mpeg4", which is technically meaningless; what we mean is MPEG-4 Part 2 Simple Profile, which is H.263-derived and sort of XviD-like, only in a different wrapper, and not H.264/AVC) and two audio codecs (MP3 and AAC).
As the writing library is the same in each version, it may be possible to calculate a lot of the restoration data by brute force by taking apart enough "osd.mp4" files to get an idea of *where* the first samples are written. At its most redundant level, once we step beyond the null header, the file consists of a series of "atoms" each of which should carry a size. If this is true, then we may be able to recover data. If this is not true, we might be able to restore something by low-level searching for the marker that video chunks appear to begin with, and as the data is interleaved, we could infer where the audio chunks are. Again, this is all pretty hairy, but as we'd only be dealing with the OSD's output...
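
The "series of atoms, each carrying a size" idea can be sketched as below. This is only an illustration, assuming the usual layout of a 4-byte big-endian size followed by a 4-byte type, where a size of 1 means a 64-bit size follows and a size of 0 means "runs to the end". The names walk_atoms and make_atom are invented here, and the test data is synthetic rather than a real "osd.mp4".

```python
import struct


def walk_atoms(data, start=0, end=None):
    """Yield (offset, type, size) for each atom found in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:                      # 64-bit size follows the type
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:                    # atom runs to the end of the data
            size = end - pos
        if size < header or pos + size > end:
            break                          # damaged or truncated atom: stop
        yield pos, kind.decode("latin-1"), size
        pos += size


def make_atom(kind, payload):
    """Build a tiny synthetic atom for the demonstration below."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


sample = (make_atom(b"ftyp", b"isom\x00\x00\x00\x00mp41")
          + make_atom(b"mdat", b"\x21" * 10)
          + make_atom(b"moov", make_atom(b"mvhd", b"\x00" * 4)))

atoms = list(walk_atoms(sample))
for offset, kind, size in atoms:
    print(offset, kind, size)
```

On a non-finalised file, where the notes suggest the "mdat" size and ID are blank, a walker like this would presumably stop right after "ftyp", which is where the marker-searching fallback would have to take over.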

We would need user assistance for:
  Dimensions
  Audio codec and bitrate

Yeah, and, um, exactly where is the fps specified?
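
One possible answer, offered as a sketch and not as a confirmed reading of these files: the frame rate may not be stored directly at all, but fall out of the "stts" delta and the "mdhd" timescale. The figures below are the ones quoted in the notes (timescale 1000, video delta usually 40, audio delta usually 64, audio sample rate 16000); the function name rate_from_stts is hypothetical.

```python
from fractions import Fraction


def rate_from_stts(timescale, delta):
    """Samples per second, given the track timescale and one stts delta."""
    return Fraction(timescale, delta)


# Video: timescale 1000, delta usually 40 (seen ranging 38-42).
video_fps = rate_from_stts(1000, 40)
print(video_fps)  # 25

# The 38-42 jitter would correspond to this range of instantaneous rates.
print(float(rate_from_stts(1000, 42)), float(rate_from_stts(1000, 38)))

# Audio: delta 64 at timescale 1000 is 64 ms per audio sample (frame).
audio_frame_seconds = Fraction(64, 1000)
pcm_per_frame = audio_frame_seconds * 16000
print(pcm_per_frame)  # 1024
```

The audio figure working out to 1024 PCM samples per frame is at least consistent with AAC, which would make forcing every audio delta to 64 less of a gamble than it first sounds.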



Right, so here is a rough overview of the file at low level:

It begins:

  00 00 00 01 f  t  y  p  i  s  o  m  00 00 00 00
  m  p  4  1  00 00 00 00 00 00 00 00 xx xx xx xx
  m  d  a  t  21 5F FE 80 00 64 0F C8 64 0F C8 DF


The xx xx xx xx is a pointer to the "moov" element. It seems to be a couple of words early, so it might be "offset from <X>" rather than from the start of the file (most likely it is the length of the "mdat" atom itself, counted from the start of that atom).
This and the following "mdat" are not present in the non-finalised file.
Byte order is big-endian: 00 59 1E 88 is the address &00591E88 (not &881E5900 or anything weirder).

The 215F....C8DF seems to be the same? Purpose? It is the first chunk of track #2, so perhaps it is the MP4A header or somesuch?

This claims to be at offset +8, but it is really at address &24 in the file. Prior is the "mdat" and its pointer.
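
The +8 versus &24 discrepancy looks like the same constant that separates the two sets of offsets quoted in the notes. A quick check of that arithmetic, using only the numbers given for the test file (the dictionary names are just labels for this example):

```python
# Offsets listed under the "mdat" track entries in the notes.
track_offsets = {
    "audio": [8, 139383, 171770],
    "video": [2756, 141195, 173453],
}
# Offsets listed in the "stco" tables inside "moov".
stco_offsets = {
    "audio": [36, 139411, 171798],
    "video": [2784, 141223, 173481],
}

differences = {
    name: [s - t for s, t in zip(stco_offsets[name], track_offsets[name])]
    for name in track_offsets
}
print(differences)  # every entry is 28

# 28 bytes is &1C, i.e. seven 4-byte words, and &1C + 8 = &24.
print(hex(28), hex(28 + 8))
```

So the two lists do correlate: the "stco" values appear to be absolute file offsets, while the track listing appears to count from a point 28 bytes (seven words) into the file, which is where the word before "mdat" sits in the dump above.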

Following the pointer, we reach the end of the data and end up (after advancing the same seven words - why can't this stuff start from address zero?) at the word before the "moov". It is 00 00 3F 58, which is 16216, which is the atom length.
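
For anyone wanting to check the byte order by hand, the two words quoted above decode like this. It is a small sketch using Python's standard struct module; the variable names are only for illustration.

```python
import struct

moov_pointer = bytes.fromhex("00 59 1E 88")
moov_length = bytes.fromhex("00 00 3F 58")

# ">I" reads an unsigned 32-bit word with the most significant byte first.
(pointer_value,) = struct.unpack(">I", moov_pointer)
(length_value,) = struct.unpack(">I", moov_length)
print(hex(pointer_value))  # 0x591e88
print(length_value)        # 16216

# For comparison, the little-endian reading that the notes rule out.
(wrong_value,) = struct.unpack("<I", moov_pointer)
print(hex(wrong_value))    # 0x881e5900
```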

And so for the TOC. Are the "mdat" track/chunk IDs built from the media description, or is it possible to step through the file to calculate the chunks?


So many questions... But one thing is for certain, any attempt to cut'n'paste header info will most likely result in plenty of failure. I might (*no* promises) play around with a damaged file to see if I can see what the bare minimum necessary to get a playable file actually is.


Okay, that's all for now.

Best wishes,

Rick.

heyrick
« Reply #1 on: April 18, 2011, 05:40:16 am »

I've done some more thinking here, and have arrived at the conclusion that anything that does such a task is extremely unlikely to be satisfactory. Could explain why there is no software capable of doing this (quite a few claim to, but I've found they don't work).

Given it is chunk-based, it would not be impossible to step through the file to detect the chunks and make a note of them. As we are dealing with OSD files, we can make a number of assumptions regarding the layout of data in the file and the codecs used. But if you read the above, there is a LOT of metadata that could not be reproduced.
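
Stepping through the file for video data could start from the "00 00 01 b6" marker noted in the first post. The sketch below is hedged accordingly: it assumes that marker is the MPEG-4 Part 2 VOP start code and that the top two bits of the following byte give the picture type (0 = I, 1 = P, 2 = B, 3 = S). The function names are invented, the test data is synthetic, and nothing here has been run against a real OSD recording.

```python
VOP_START = bytes.fromhex("000001b6")


def find_markers(data, marker=VOP_START):
    """Return every position in data where the marker occurs."""
    positions = []
    pos = data.find(marker)
    while pos != -1:
        positions.append(pos)
        pos = data.find(marker, pos + 1)
    return positions


def vop_coding_type(data, pos):
    """Return 'I', 'P', 'B' or 'S' for the VOP at pos, or None if truncated."""
    if pos + 4 >= len(data):
        return None
    return "IPBS"[data[pos + 4] >> 6]


# Synthetic stand-in: 8 bytes of "audio", then an I picture, then a P picture.
data = (b"\x21" * 8
        + VOP_START + b"\x10" + b"\xaa" * 5
        + VOP_START + b"\x50" + b"\xbb" * 3)

positions = find_markers(data)
types = [vop_coding_type(data, p) for p in positions]
print(positions, types)
```

Note that a scan like this would find every video frame, not just the first frame of each chunk, and the same four bytes could turn up by chance inside audio data, so the hits would still need sanity checking. The 'I' hits might at least give a way to rebuild the keyframe list.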

Damn annoying this closed-source codec. It would be nice to paste in a data block, say prior to every keyframe, that contained the TOC information for the last little bit so that an MP4 file could be rebuilt if damaged/incomplete... But then again, it is as much a flaw in the MP4 spec as it is the codec being closed source.

<sigh>


Best wishes,

Rick.
greyback
Administrator
Hero Member
*****
Posts: 1639


View Profile
« Reply #2 on: April 19, 2011, 01:19:21 pm »

Hey,
this does look like a very difficult project. Are you aware that temporary files are created by the recording process to store some data? Offhand I think they're in /media/ext/data, and I think they contain finalisation data; perhaps something left there is useful?
-G

heyrick
« Reply #3 on: April 19, 2011, 01:42:50 pm »

Hi,

That is useful to know. I'll have to set up a recording, grab a copy of the temp file and see if the contents bear relation to the file TOC.
Even so, the chance of a file repair is minimal. .mp4 is a difficult format, but what d'you expect for something that began life as QuickTime video? ;-)

greyback
« Reply #4 on: April 19, 2011, 01:59:03 pm »

Yep, it's not easy going. However I would have thought that finalisation was not such a complex procedure, just writing a correct header to the file.

Recalling the stream_fuse project, it was able to write a temporary header to the frames and stream it straight away.
-G

heyrick
« Reply #5 on: April 19, 2011, 05:37:56 pm »

Quote
just writing a correct header to the file.

The finalisation data is a lot more than "a correct header", but you have me wondering how much is really necessary.

Still, thinking most about the JTAG gubbins right now...


Quote
Recalling the stream_fuse project, it was able to write a temporary header to the frames and stream it straight away.

Whoa, wait... So I could set the satellite to NHK World, the OSD would "record" this and push it through the LAN so I could watch it in near-realtime on my netbook?
Okay, now you've got my interest! Tell me more...


Best wishes,

Rick.

greyback
« Reply #6 on: April 19, 2011, 06:26:58 pm »

Quote
Whoa, wait... So I could set the satellite to NHK World, the OSD would "record" this and push it through the LAN so I could watch it in near-realtime on my netbook?
Yep. I had it working at one stage.

Quote
Okay, now you've got my interest! Tell me more...
I think the binary "stream_fuse" is built into most recent firmware (not sure if that includes OSDng), but I don't believe it's exposed in the UI. Check out
http://wiki.neurostechnology.com/index.php/StreamFuse
for more info. I tracked down the source code here:
http://matthewwild.co.uk/uploads/stream-fuse-1.0.0.tar.gz
and huge credit must go to "mgschwan" for writing it. Matthew Wild must also have contributed, but I'm not sure of that.
-G

heyrick
« Reply #7 on: April 19, 2011, 09:32:24 pm »

Thanks for the links. Doesn't look too problematic (famous last words?!), and it's nice to know that SMPlayer ought to cope with it okay (I only use VLC for playing fansubs in weird formats nothing else will touch).

Hell, it's only quarter past three (in the morning). I'll try it now. ;-)


Hmm... Won't work. I get the first frame, followed by nothing. Here's the MPlayer log:
D:\Program Files\SMPlayer\mplayer>mplayer http://192.168.0.12:10001/3 -cache 2000
MPlayer Sherpya-SVN-r30369-4.2.5 (C) 2000-2009 MPlayer Team

Playing http://192.168.0.12:10001/3.
Connecting to server 192.168.0.12[192.168.0.12]: 10001...
Cache size set to 2000 KBytes
Cache fill: 19.60% (401408 bytes)
ASF file format detected.
[asfheader] Video stream found, -vid 2
[asfheader] Audio stream found, -aid 1
VIDEO:  [M4S2]  352x288  24bpp  1000.000 fps    0.0 kbps ( 0.0 kbyte/s)
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
Selected video codec: [ffodivx] vfm: ffmpeg (FFmpeg MPEG-4)
==========================================================================
==========================================================================
Opening audio decoder: [ffmpeg] FFmpeg/libavcodec audio decoders
[mp3 @ 0107B8F4]incomplete frame
[mp3 @ 0107B8F4]Header missing
[mp3 @ 0107B8F4]Header missing
[mp3 @ 0107B8F4]Header missing
[mp3 @ 0107B8F4]Header missing
[mp3 @ 0107B8F4]Header missing
AUDIO: 44100 Hz, 2 ch, s16le, 96.0 kbit/6.80% (ratio: 12000->176400)
Selected audio codec: [ffmp3] afm: ffmpeg (FFmpeg MPEG layer-3 audio)
==========================================================================
AO: [dsound] 44100Hz 2ch s16le (2 bytes per sample)
Starting playback...
[mp3 @ 0107B8F4]Header missing
Movie-Aspect is 1.22:1 - prescaling to correct movie aspect.
VO: [directx] 352x288 => 352x288 Planar YV12
[mp3 @ 0107B8F4]Header missingct:  0.000   2/  2 ??% ??% ??,?% 0 0 22%
[mp3 @ 0107B8F4]Header missingct:  0.004   3/  3 ??% ??% ??,?% 0 0 22%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 21%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 21%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 21%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 21%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 21%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 20%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 20%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 21%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 21%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 21%
[mp3 @ 0107B8F4]Header missingct:  0.004   4/  4 ??% ??% ??,?% 0 0 21%

I break out of MPlayer at this point...


Mmm, VLC plays it. Just watched the news on NHK World. I find the high bitrate method (7?) is not useful. Even running 100 Mbit Ethernet with no other traffic, it can't keep up with the data stream.

This works quite nicely: http://192.168.0.12:10001/4  <-- quality preset option '4'

The /manual option doesn't appear to work.

The only thing that you should watch out for is that the whole system is extremely crashy - you can often find the record process has frozen. In getting this far, I've rebooted my OSD numerous times. I've just stopped watching, so I'll reboot again, just to be sure the record process isn't lingering.

Still, it is an interesting project. A shame it wasn't continued.


Best wishes,

Rick.
pfft2001
Sr. Member
****
Posts: 378



View Profile
« Reply #8 on: April 24, 2011, 01:14:51 pm »

I don't have much to contribute on the subject of MP4 file repair, but I think this post may be somewhat related to the underlying subject of this thread.

That is, on the general idea that "an ounce of prevention is worth a pound of cure", the question could be rephrased as "What causes the OSD to produce broken MP4 files?", which leads to my point which is: "Too much uptime".  Recent experience suggests that the OSD can get into a state where it produces crappy files as output (a variety of symptoms, including bad audio/video sync, only one audio channel, frozen picture, etc) --> and rebooting the box fixes the problem.  This suggests that it may be a good idea to periodically reboot the OSD [*], but I see nothing in the documentation to that effect?

Does Neuros have an official word on this topic?

[*] But be aware (and beware!) of the fact that the scheduler usually doesn't come up correctly after a reboot - it is necessary to do a "sync" operation (manually: by going into "View Scheduled Recordings" and re-saving an entry - this topic has been discussed at length previously on this board) in order to get the scheduler running again after a crash, er, I mean, reboot.