[Yaffs] YAFFS2 and kswapd dead lock problem

Lawson.Reed Reed.Lawson at IGT.com
Mon Dec 5 18:42:43 GMT 2005


> -----Original Message-----
> From: Charles Manning [mailto:manningc2 at actrix.gen.nz]
> Sent: Sunday, December 04, 2005 9:55 AM
> To: yaffs at stoneboat.aleph1.co.uk
> Cc: Lawson.Reed
> Subject: Re: [Yaffs] YAFFS2 and kswapd dead lock problem
> 
> 
> On Saturday 03 December 2005 07:56, Lawson.Reed wrote:
> > Found it.... took me 5 work days.
> > And this deadlock issue IS in the YAFFS2 tip code on CVS here:
> > 
> http://www.aleph1.co.uk/cgi-bin/viewcvs.cgi/yaffs2/yaffs_fs.c?
> rev=1.34&view
> >=auto
> >
> 
> Thanks for this huge effort Reed.
> 
> I am quite surprised that someone else has not stumbled on 
> this either.
> 
> Does it only impact on 2.4.x kernels or also 2.6?

>From looking at the inode.c in 2.6 I see they are still doing the 
wait_queue thing on the I_FREEING flag just the same as 2.4.
So, this deadlock issue should also be in 2.6 kernels. I do not
have a system running 2.6 so I can not verify this.

It only occurs under heavy multi thread load on the yaffs2 fs.
It is fairly infrequent and to troubleshoot it, I had to write
a torture test program.

> > So, no one has seen this???
> >
> > Here is what is happening:
> >
> > Process 'A' grabs the YAFFS2 grossLock.
> > Process 'B' preempts and it's job is to free unused inodes 
> everywhere.
> > (hint: 'B' is kswapd). So, 'B' sets I_FREEING. Then it calls
> > yaffs_clear_inode() which needs the grossLock. So, it goes
> > on the wait queue because 'A' has the grossLock.
> >
> > Now process 'A' runs. It's holding the grossLock. It calls
> > yaffs_get_inode() which calls BACK UP to iget()... With
> > the grossLock held! That calls find_inode(). It finds
> > I_FREEING set and then gets put on a wait queue in
> > __wait_on_freeing_inode().
> >
> > Presto chango deadlock.
> >
> > So, my solution is to make sure the grossLock is not held when
> > calling yaffs_get_inode(). Plus, I added grossLocking to
> > yaffs_read_inode() since NB's comment in there is no longer
> > true.
> >
> > I ran my 20 thread torture test which usually deadlocks in under 30
> > seconds. It ran overnight with this fix. The test found no compare
> > errors in the 20 files that it reads and writes at random times with
> > random data and random lengths.
> >
> > So, I strongly suggest that someone close to the YAFFSs 
> effort review
> > this change and incorporate it. I am kinda new to all this and I'm
> > not even sure what the correct way to submit the changes are.
> > So, let me know how I can help.
> 
> Did you mean to include some code?

No.

> The easiest is to send something in patch form using "diff 
> -Naur old new", but  others are fine too.

OK, how is this? Warning... I did not even compile this in the 
tip YAFFS2 code because I do not have a 2.6 tree. So, I copied my
changes from my 2.4 tree into the tip yaffs2/yaffs_fs.c code.
So, please review it carefully... 

I started with this:
http://www.aleph1.co.uk/cgi-bin/viewcvs.cgi/yaffs2/yaffs_fs.c?rev=1.34&view=auto
File: [Development] / yaffs2 / yaffs_fs.c (download) (as text) 
Revision: 1.34, Mon Nov 14 21:00:54 2005 UTC (2 weeks, 3 days ago) by charles

Thanks,
- Reed.

--------------------------------------------8<--------------------------------
diff -Naur yaffs2_old/yaffs_fs.c yaffs2/yaffs_fs.c
--- yaffs2_old/yaffs_fs.c       2005-12-05 09:53:10.000000000 -0800
+++ yaffs2/yaffs_fs.c   2005-12-05 10:15:32.000000000 -0800
@@ -310,6 +310,9 @@
  
        obj = yaffs_GetEquivalentObject(obj);   /* in case it was a hardlink */
  
+       // can not hold grossLock when calling yaffs_get_inode()
+       yaffs_GrossUnlock(dev);
+
        if (obj) {
                T(YAFFS_TRACE_OS,
                  (KERN_DEBUG "yaffs_lookup found %d\n", obj->objectId));
@@ -325,8 +328,6 @@
                        /*dget(dentry); // try to solve directory bug */
                        d_add(dentry, inode);
  
-                       yaffs_GrossUnlock(dev);
-
                        /* return dentry; */
                        return NULL;
 #endif
@@ -336,7 +337,6 @@
                T(YAFFS_TRACE_OS, (KERN_DEBUG "yaffs_lookup not found\n"));
  
        }
-       yaffs_GrossUnlock(dev);
  
 /* added NCB for 2.5/6 compatability - forces add even if inode is
  * NULL which creates dentry hash */
@@ -725,6 +725,8 @@
  
        /* NB Side effect: iget calls back to yaffs_read_inode(). */
        /* iget also increments the inode's i_count */
+       // RL AND you better not be holding the grossLock, or dead lock with wait_on_freeing_inode()
+       // will occur because of kswapd (and other things)!!
  
        return inode;
 }
@@ -941,6 +943,9 @@
                break;
        }
  
+       // can not call yaffs_get_inode() with grossLock held.
+       yaffs_GrossUnlock(dev);
+
        if (obj) {
                inode = yaffs_get_inode(dir->i_sb, mode, rdev, obj);
                d_instantiate(dentry, inode);
@@ -954,8 +959,6 @@
                error = -ENOMEM;
        }
  
-       yaffs_GrossUnlock(dev);
-
        return error;
 }
  
@@ -1238,6 +1241,7 @@
 {
        /* NB This is called as a side effect of other functions and
         * thus gross locking should always be in place already.
+         * RL Put the grossLock back in because of dead lock issue.
         */
  
        yaffs_Object *obj;
@@ -1246,10 +1250,14 @@
        T(YAFFS_TRACE_OS,
          (KERN_DEBUG "yaffs_read_inode for %d\n", (int)inode->i_ino));
  
+       yaffs_GrossLock(dev);
+
        obj = yaffs_FindObjectByNumber(dev, inode->i_ino);
  
        yaffs_FillInodeFromObject(inode, obj);
  
+       yaffs_GrossUnlock(dev);
+
 }
  
 static LIST_HEAD(yaffs_dev_list);
@@ -1478,13 +1486,13 @@
          ("yaffs_read_super: guts initialised %s\n",
           (err == YAFFS_OK) ? "OK" : "FAILED"));
  
+       // Can not call yaffs_get_inode() with the grossLock held.
+       yaffs_GrossUnlock(dev);
+
        /* Create root inode */
        if (err == YAFFS_OK)
                inode = yaffs_get_inode(sb, S_IFDIR | 0755, 0,
                                        yaffs_Root(dev));
-
-       yaffs_GrossUnlock(dev);
-
        if (!inode)
                return NULL;
  
--------------------------------------------8<--------------------------------
> 
> Thanx
> 
> Charles
> 



More information about the yaffs mailing list