[Yaffs] YAFFS2 Development

Jon Masters jonmasters at gmail.com
Wed Dec 7 11:58:24 GMT 2005


On 12/6/05, Charles Manning <manningc2 at actrix.gen.nz> wrote:

Hiya Charles. I owe you a beer if I make it to LinuxConf (assuming you
are going).

> It is always great to get a bunch of feedback from someone doing something a
> bit different and stressing the system.

I figured.

> Some of the issues you raise have, I think, been dealt to in the last while,
> or have been recently tackled but there is still scope to use this info in a
> very useful way.

Yeah. Hence my emphasis on the vendor association. It's quite possible
much of this has gone away in the mainline version.

> I have some comments below. I would appreciate any further comment/patches as
> this will allow the code to be improved. I am not at all obsessed with the
> form it is provided in (C, patch, whatever). Someone that has been through
> the code as thoroughly as this will have a lot of useful perspective that I
> hope to be able to exploit :-).

Sure. I'm light on time and hardware for making lots of patches (this
hardware is for a customer and requires MVL3.1 as it stands), but I hope
to get hold of a board with a similar spec that can run a recent 2.6
kernel ASAP. Anyway, meanwhile...

> On Wednesday 07 December 2005 05:10, Jon Masters wrote:
> <snip nice words>
>
> > * YAFFS2 should not use any shortopcaches when on Linux. It seems to
> >   make little difference to performance (only negative).
>
> That was my initial impression too. The shortop cache stuff was originally
> written for WinCE which does not work through a page cache like Linux does.
> In particular, nasty situations like
>
> //Yes, real WinCE programs really have code like this
>  while (!eof) {
>      read(fd, &c, 1);    /* read one byte */
>  }

Ouch. I've always thought those guys were smoking something good and
now I actually have the proof.

> YAFFS was dead without some caching. By default it was turned off for Linux.
> Then somebody ran code of the form
>
>   for (i = 0; i < 100000; i++) {
>       write(fd, &c, 1);   /* write one byte */
>   }
>
> The Linux cache is write-through so these calls were observed to be slow under
> Linux too and enabling the short op cache fixed the problem. From then on,
> the shortopcache has been enabled by default.

I agree with that. In the testing I did, it was reading and writing
very large files sequentially (in line with the requirements), but I can
see the possible problem.

> 2) Perhaps only using the short op cache for write operations would be the
> best way to do things under Linux?

That would be the best thing to do. Otherwise we just waste time on
reads when the page cache will get populated by YAFFS2 anyway after a
readpage.
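
Something along these lines is what I have in mind - the helper name is
made up, not an actual yaffs_guts entry point, so treat it as a sketch
of the policy rather than a patch:

    /* Sketch: gate the short op cache by direction.  Under Linux the
     * page cache already soaks up small reads, so only the write path
     * benefits from the extra buffering. */
    static int yaffs_use_short_op_cache(int is_write)
    {
    #ifdef __KERNEL__
            return is_write;        /* Linux: cache writes only */
    #else
            return 1;               /* WinCE etc: cache both directions */
    #endif
    }

The read path would then skip the cache lookup entirely and let
readpage fill the page cache as it does now.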

> > * YAFFS2 memory allocation using kmalloc does not work on very large
> >   devices and needs to use vmalloc instead in those cases (>2GB devices).
> >   The lack of checking for success proves to be a problem.

> I think this only impacts on the creation of the huge chunk bitmap structure.
> If so, this was dealt to in
> http://www.aleph1.co.uk/cgi-bin/viewcvs.cgi/yaffs2/yaffs_guts.c?r1=1.20&r2=1.21
> Andre tested this, IIRC, and this fixed the problem.

His hack would seem to fix that problem.

> Is more required?

I think YAFFS2 needs to be smarter about how it allocates memory. We
have a limit on vmalloc space too (though it's pretty big), so getting
away from unbounded allocations and having smaller buffers may become
necessary on very large devices.

> Yes, definitely the handling of alloc failures is a bit sloppy.

That is the main problem - you don't know allocations are failing until
you guess that's what is happening (reading comments along the lines of
"we should probably check if this fails" was helpful, I'll grant).

> > * YAFFS2 has various internal usage of types which makes it difficult to
> >   scale to >2GB devices. We have to divide up into multiple partitions.
>
> Can you give some details? I would like to fix this. There are some places
> where ints are being used where off_t would be correct.

That sort of thing. I started doing wholesale replacements, but YAFFS2
is corrupting kernel memory and causing untold trouble when devices
are over 2GB. There seem to be a few places that I missed, and I didn't
have a continuing brief to look at it - certainly I'd go through and
fix this use of ints (and typecasts).
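
To give a concrete flavour of the failure mode (field names from
memory, so treat them as illustrative):

    /* With 2k data chunks, any chunk index above about a million puts
     * the byte offset past 2^31, and a plain int silently wraps: */
    int    offset_bad = chunkId * dev->nDataBytesPerChunk;            /* wraps beyond 2GB */
    loff_t offset_ok  = (loff_t)chunkId * dev->nDataBytesPerChunk;    /* 64-bit, safe */

The cast matters as much as the declared type - a loff_t on the left
does not help if the multiply on the right is still done in 32 bits.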

> The chunkGroupBits issue also has an impact on this.
> >
> > * Andre Renaud latched onto a problem which I then rediscovered in
> >   performance testing. Having chunk groups of 16 reduces performance by
> >   at least 50% but in practice can be much higher. By applying a version
> >   of his patch, I was able to reduce read time for a 50MB file from 27
> >   seconds to around 15 seconds and have achieved sustained reads at
> >   22.2Mbit/s on multi-GB devices reading many hundred MBs.
>
> I have written some code (minor testing so far, more testing and checkin
> within 24 hours I hope) which should fix this.
>
> This code uses variable size bitmaps to fit the required bit width, thus
> eliminating chunkgroups, but does not use as much RAM as the Bluewater hack.

I saw your postings. I think that is a *much* better idea since it
will increase performance by 50-100% for some people. I combined that
hack with a couple of other fixes and a DMA-enabled MTD driver to push
performance to more than double what it was when I started working on it.
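
For anyone following along, the idea (this is my sketch of it, not
Charles's actual code) is to pack each chunk id into exactly as many
bits as the device needs, so every chunk gets a unique id and the
chunk-group scan disappears:

    /* Sketch: pull an n-bit chunk id out of a packed tnode array.
     * tnode_width is whatever the device needs (e.g. 20 bits for a
     * 2GB part with 2k chunks), rather than a fixed 16. */
    static u32 tnode_get(const u8 *map, unsigned pos, unsigned tnode_width)
    {
            unsigned bit = pos * tnode_width;
            u32 val = 0;
            unsigned i;

            for (i = 0; i < tnode_width; i++, bit++)
                    if (map[bit >> 3] & (1u << (bit & 7)))
                            val |= 1u << i;
            return val;
    }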

> > * YAFFS2 makes use of some additional reads and memcpy's which don't
> >   seem entirely necessary - by combining and changing some of the logic
> >   it looks like we could get another 10% performance gain.

> Very much look forward to more info on this.

OK, I'll look into that. There are several places where we call the MTD
read more than once where a single call would do (with some extra logic),
and a few memcpys where I think Linux could deal with a direct pointer
instead (the MTD layer should handle the caching issues and memory
coherence problems by doing any additional copies).
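
The kind of pattern I mean (buffer names here are made up, but the
mtd->read call is the usual one):

    /* Today, roughly: bounce through a temporary buffer... */
    mtd->read(mtd, addr, dev->nDataBytesPerChunk, &retlen, localBuffer);
    memcpy(destBuffer, localBuffer, dev->nDataBytesPerChunk);

    /* ...when we could often read straight into the caller's buffer: */
    mtd->read(mtd, addr, dev->nDataBytesPerChunk, &retlen, destBuffer);

Whether that is safe in every path depends on alignment and on who owns
the buffer, which is where the "extra logic" comes in.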

> The WinCE stuff has some extra copying (that is actually no longer required
> and will be eliminated). I hoped the Linux stuff was not doing too much extra
> work.

Not too much, but I took out one extra read (I'll track it down) and
got a speed bump of around 5-10% in one go. A few more of those (it's
worth someone sitting down and poring over this code if there is
justification) and we've got free extra speed. Certainly YAFFS2 is
now approaching raw NAND performance when reading and writing
through /dev/mtd/blah, and that is the goal.

Jon.
