Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: torva...@transmeta.com
cc: linux-ker...@vger.kernel.org
Subject: AUDIT: copy_from_user is a deathtrap.
Original-Date: Fri, 17 May 2002 19:27:54 +1000
Original-Message-Id: <E178e1l-0007qB-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 17 May 2002 09:26:34 GMT
Message-ID: <fa.je269uv.r04ja8@ifi.uio.no>
Lines: 37
Linus,
Should I change copy_to/from_user to return -EFAULT, or
introduce a new copy_to/from_uspace which does and start moving
everything across?
There are 5,500 uses of copy_to/from_user in 2.5.15. 52 of them use
the return value in a way which would be broken by it returning
-EFAULT. 51 of those don't need to (mainly cut & paste between serial
drivers).
/* Returns amount which wasn't copied before EFAULT.  Used by mount. */
static inline unsigned long
gradual_copy_from_user(void *to, const void *from, unsigned long n)
{
	unsigned long i;

	/* Copy one byte at a time so the returned residue is exact. */
	for (i = 0; i < n; i++, to++, from++) {
		/* copy_from_user() takes the destination first. */
		if (copy_from_user(to, from, 1) != 0)
			break;
	}
	return n - i;
}
There are 415 uses of copy_to/from_user which are wrong, despite an
audit 12 months ago by the Stanford checker.
Tired of auditing the same bugs,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!sfo2-feed1.news.algx.net!allegiance!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Fri, 17 May 2002 02:21:48 -0700 (PDT)
Original-Message-Id: <20020517.022148.48851839.davem@redhat.com>
To: ru...@rustcorp.com.au
Cc: torva...@transmeta.com, linux-ker...@vger.kernel.org
Subject: Re: AUDIT: copy_from_user is a deathtrap.
From: "David S. Miller" <da...@redhat.com>
In-Reply-To: <E178e1l-0007qB-00@wagner.rustcorp.com.au>
Original-References: <E178e1l-0007qB...@wagner.rustcorp.com.au>
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 17 May 2002 09:36:41 GMT
Message-ID: <fa.hrhe4cv.1ugap8u@ifi.uio.no>
References: <fa.je269uv.r04ja8@ifi.uio.no>
Lines: 13
   From: Rusty Russell <ru...@rustcorp.com.au>
   Date: Fri, 17 May 2002 19:27:54 +1000
   There are 415 uses of copy_to/from_user which are wrong, despite an
   audit 12 months ago by the Stanford checker.
I would much rather fix these instances than add yet another
interface.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!howland.erols.net!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: "David S. Miller" <da...@redhat.com>
Cc: torva...@transmeta.com, linux-ker...@vger.kernel.org
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Fri, 17 May 2002 02:21:48 MST."
<20020517.022148.48851839.davem@redhat.com>
Original-Date: Fri, 17 May 2002 19:49:40 +1000
Original-Message-Id: <E178eMm-0000NO-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 17 May 2002 09:47:49 GMT
Message-ID: <fa.iskoa6v.4n6j2f@ifi.uio.no>
References: <fa.hrhe4cv.1ugap8u@ifi.uio.no>
Lines: 22
In message <20020517.022148.48851839.da...@redhat.com> you write:
> From: Rusty Russell <ru...@rustcorp.com.au>
> Date: Fri, 17 May 2002 19:27:54 +1000
>
> There are 415 uses of copy_to/from_user which are wrong, despite an
> audit 12 months ago by the Stanford checker.
>
> I would much rather fix these instances than add yet another
> interface.
I'll accept that if someone's volunteering to audit the kernel for
them every six months.
Sorry I wasn't clear: I'm saying *replace*, not add,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Path: archiver1.google.com!news2.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
To: ru...@rustcorp.com.au (Rusty Russell)
Original-Date: Fri, 17 May 2002 13:17:25 +0100 (BST)
Cc: da...@redhat.com (David S. Miller), torva...@transmeta.com,
linux-ker...@vger.kernel.org
In-Reply-To: <E178eMm-0000NO-00@wagner.rustcorp.com.au> from "Rusty Russell" at May 17, 2002 07:49:40 PM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E178gfl-0006Ip-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 17 May 2002 12:27:42 GMT
Message-ID: <fa.gk4v6ev.1j5eo2h@ifi.uio.no>
References: <fa.iskoa6v.4n6j2f@ifi.uio.no>
Lines: 26
> > I would much rather fix these instances than add yet another
> > interface.
>
> I'll accept that if someone's volunteering to audit the kernel for
> them every six months.
>
> Sorry I wasn't clear: I'm saying *replace*, not add,
Replace requires you audit every single use, and then work out how to
handle those that do care about the length and the point it faulted. From
what I've seen of the stuff that has been fixed we have a mix of the
following
1. Misports of ancient verify_* code - eg the serial ones
2. Not checking the return code - 100% legal and standards compliant
I've seen very few that have other screwups. In fact I've seen far more
incorrect uses of kmalloc with a user passed input field, kmalloc with
maths overflows, copy*user with maths overflows and the like
Alan
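[Illustration, not part of the original message.] The "maths overflow" class of bug mentioned above usually comes from multiplying a user-supplied count by an element size before allocating and copying. A minimal userspace sketch of the check follows; struct item, fetch_items() and the use of malloc()/memcpy() in place of kmalloc()/copy_from_user() are all assumptions made for the example, not kernel code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, for illustration only. */
struct item {
	uint32_t id;
	uint32_t value;
};

/*
 * Copies `count` items from `user_buf` into a fresh allocation.
 * `count` is caller-controlled, so count * sizeof(struct item) is checked
 * for wraparound before it reaches the allocator or the copy.
 */
int fetch_items(const struct item *user_buf, size_t count, struct item **out)
{
	struct item *items;

	if (count == 0 || count > SIZE_MAX / sizeof(*items))
		return -EINVAL;

	items = malloc(count * sizeof(*items));	/* stands in for kmalloc() */
	if (!items)
		return -ENOMEM;

	/* Stands in for copy_from_user(); cannot fault in this sketch. */
	memcpy(items, user_buf, count * sizeof(*items));
	*out = items;
	return 0;
}
```

Real code would normally also cap count at some sane upper bound rather than rely on the allocator to refuse a huge request.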
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: Alan Cox <a...@lxorguk.ukuu.org.uk>
Cc: da...@redhat.com (David S. Miller), torva...@transmeta.com,
linux-ker...@vger.kernel.org
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Fri, 17 May 2002 13:17:25 +0100."
<E178gfl-0006Ip-00@the-village.bc.nu>
Original-Date: Fri, 17 May 2002 22:21:45 +1000
Original-Message-Id: <E178gkH-0001LV-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 17 May 2002 12:20:59 GMT
Message-ID: <fa.isn20uv.4koqae@ifi.uio.no>
References: <fa.gk4v6ev.1j5eo2h@ifi.uio.no>
Lines: 33
In message <E178gfl-0006Ip...@the-village.bc.nu> you write:
> > > I would much rather fix these instances than add yet another
> > > interface.
> >
> > I'll accept that if someone's volunteering to audit the kernel for
> > them every six months.
> >
> > Sorry I wasn't clear: I'm saying *replace*, not add,
>
> Replace requires you audit every single use, and then work out how to
> handle those that do care about the length and the point it faulted.
Read my original post. I have done this.
> From what I've seen of the stuff that has been fixed we have a mix
> of the following
>
> 1. Misports of ancient verify_* code - eg the serial ones
> 2. Not checking the return code - 100% legal and standards compliant
No, the 400+ are all of form:
	/* of course this returns 0 or -EFAULT! */
	return copy_from_user(xxx);
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
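[Illustration, not part of the original message.] The trap described above can be reproduced in userspace. The sketch below assumes a hypothetical stand-in, sim_copy_from_user(), which mimics the kernel contract of returning the number of bytes NOT copied; the `readable` fault boundary and both handler names are invented for the example.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_from_user(): only the first `readable`
 * bytes of the source are accessible.  Like the real call, it returns the
 * number of bytes that could NOT be copied (0 on full success).
 */
unsigned long sim_copy_from_user(void *to, const void *from,
				 unsigned long n, unsigned long readable)
{
	unsigned long ok = n < readable ? n : readable;

	memcpy(to, from, ok);
	return n - ok;
}

/*
 * The trap: the residue is passed straight up as if it were 0 or -EFAULT,
 * so a fault produces a positive value that callers do not treat as an
 * error.
 */
long buggy_handler(void *dst, const void *src, unsigned long n,
		   unsigned long readable)
{
	return (long)sim_copy_from_user(dst, src, n, readable);
}

/* The conventional idiom: any nonzero residue becomes -EFAULT. */
long fixed_handler(void *dst, const void *src, unsigned long n,
		   unsigned long readable)
{
	if (sim_copy_from_user(dst, src, n, readable))
		return -EFAULT;
	return 0;
}
```

With a 16-byte request and only 10 readable bytes, buggy_handler() hands back 6 where the caller expects 0 or a negative errno, while fixed_handler() returns -EFAULT.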
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!headwall.stanford.edu!newsfeed.utk.edu!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
To: ru...@rustcorp.com.au (Rusty Russell)
Original-Date: Fri, 17 May 2002 13:58:53 +0100 (BST)
Cc: a...@lxorguk.ukuu.org.uk (Alan Cox), da...@redhat.com (David S. Miller),
torva...@transmeta.com, linux-ker...@vger.kernel.org
In-Reply-To: <E178gkH-0001LV-00@wagner.rustcorp.com.au> from "Rusty Russell" at May 17, 2002 10:21:45 PM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E178hJt-0006Rb-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 17 May 2002 13:25:20 GMT
Message-ID: <fa.gp2b8ev.1p6iu2h@ifi.uio.no>
References: <fa.isn20uv.4koqae@ifi.uio.no>
Lines: 11
> No, the 400+ are all of form:
>
> /* of course this returns 0 or -EFAULT! */
> return copy_from_user(xxx);
So let's verify and fix them. Post the list to the kernel janitors.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: Alan Cox <a...@lxorguk.ukuu.org.uk>
Cc: linux-ker...@vger.kernel.org, torva...@transmeta.com
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Fri, 17 May 2002 13:58:53 +0100."
<E178hJt-0006Rb-00@the-village.bc.nu>
Original-Date: Fri, 17 May 2002 22:58:08 +1000
Original-Message-Id: <E178hJU-0002GS-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 17 May 2002 12:57:20 GMT
Message-ID: <fa.iqkq46v.6mgt2d@ifi.uio.no>
References: <fa.gp2b8ev.1p6iu2h@ifi.uio.no>
Lines: 23
In message <E178hJt-0006Rb...@the-village.bc.nu> you write:
> > No, the 400+ are all of form:
> >
> > /* of course this returns 0 or -EFAULT! */
> > return copy_from_user(xxx);
>
> So let's verify and fix them. Post the list to the kernel janitors.
Again, like we did 12 months ago you mean?
We could do that, or, we could fix the actual problem, which is the
HUGE FUCKING BEARTRAP WHICH CATCHES EVERY SINGLE NEW PROGRAMMER ON THE
WAY THROUGH.
Not fixing earlier was criminal, not fixing it today is insane.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!news.tele.dk!small.news.tele.dk!130.240.42.8!luth.se!newsfeed1.uni2.dk!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
To: ru...@rustcorp.com.au (Rusty Russell)
Original-Date: Fri, 17 May 2002 15:52:20 +0100 (BST)
Cc: a...@lxorguk.ukuu.org.uk (Alan Cox), linux-ker...@vger.kernel.org,
torva...@transmeta.com
In-Reply-To: <E178hJU-0002GS-00@wagner.rustcorp.com.au> from "Rusty Russell" at May 17, 2002 10:58:08 PM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E178j5g-0006en-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Fri, 17 May 2002 14:34:29 GMT
Message-ID: <fa.h3hp56v.13hkqqh@ifi.uio.no>
References: <fa.iqkq46v.6mgt2d@ifi.uio.no>
Lines: 19
> Again, like we did 12 months ago you mean?
We didn't fix them 12 months ago.
> We could do that, or, we could fix the actual problem, which is the
> HUGE FUCKING BEARTRAP WHICH CATCHES EVERY SINGLE NEW PROGRAMMER ON THE
> WAY THROUGH.
Capital letters versus content. I'd prefer content.
All the cases I looked at were replications of existing bugs copied from
old drivers. That doesn't say copy_*_user is wrong, it says lots of the
examples people keep using are wrong.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: Alan Cox <a...@lxorguk.ukuu.org.uk>
Cc: linux-ker...@vger.kernel.org, torva...@transmeta.com
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Fri, 17 May 2002 15:52:20 +0100."
<E178j5g-0006en-00@the-village.bc.nu>
Original-Date: Sat, 18 May 2002 11:26:48 +1000
Original-Message-Id: <E178t00-0006e2-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sat, 18 May 2002 01:24:38 GMT
Message-ID: <fa.jfh3quv.pn64a9@ifi.uio.no>
References: <fa.h3hp56v.13hkqqh@ifi.uio.no>
Lines: 42
In message <E178j5g-0006en...@the-village.bc.nu> you write:
> > We could do that, or, we could fix the actual problem, which is the
> > HUGE FUCKING BEARTRAP WHICH CATCHES EVERY SINGLE NEW PROGRAMMER ON THE
> > WAY THROUGH.
>
> Capital letters versus content. I'd prefer content
1) Returning 0 on success, and -errno on error is a common kernel
convention.
2) Following kernel conventions makes it easier for other programmers
to use your code.
3) You should only violate kernel conventions when there is a
compelling reason.
   3a) If you're going to break a convention, do it in a way that
       breaks the compile, or
   3b) If you can't do that, make it reliably break at runtime.
4) The single case which requires this information can be fixed by a
simple 10-line wrapper function.
I do not believe this is a compelling reason to violate kernel
convention in a way which is almost impossible to notice. I further
believe that it speaks very poorly about the thought put into kernel
interface design.
> All the cases I looked at were replications of existing bugs copied from
> old drivers.
Try looking at intermezzo, or the s390 and s390x ports. New code, new
coders, same trap.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Fri, 17 May 2002 19:37:21 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Rusty Russell <ru...@rustcorp.com.au>
cc: "David S. Miller" <da...@redhat.com>, <linux-ker...@vger.kernel.org>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <E178eMm-0000NO-00@wagner.rustcorp.com.au>
Original-Message-ID: <Pine.LNX.4.44.0205171936220.1524-100000@home.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sat, 18 May 2002 02:39:02 GMT
Message-ID: <fa.m6ekdiv.14g698t@ifi.uio.no>
References: <fa.iskoa6v.4n6j2f@ifi.uio.no>
Lines: 18
On Fri, 17 May 2002, Rusty Russell wrote:
>
> Sorry I wasn't clear: I'm saying *replace*, not add,
Ok, let _me_ be clear: replacing them with an inferior product that cannot
tell you partial copies is not going to happen. Not now, not ever. You
would break all the users who _require_ knowing about a read() that only
gave you 5 out of 50 bytes.
Linus
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
To: Linus Torvalds <torva...@transmeta.com>
Cc: linux-ker...@vger.kernel.org, ru...@rustcorp.com.au,
a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
Original-References: <E178eMm-0000NO...@wagner.rustcorp.com.au.suse.lists.linux.kernel> <Pine.LNX.4.44.0205171936220.1524-100...@home.transmeta.com.suse.lists.linux.kernel>
From: Andi Kleen <a...@suse.de>
Original-Date: 18 May 2002 12:16:40 +0200
In-Reply-To: Linus Torvalds's message of "18 May 2002 04:41:47 +0200"
Original-Message-ID: <p733cwpzrp3.fsf@oldwotan.suse.de>
Lines: 26
X-Mailer: Gnus v5.7/Emacs 20.6
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sat, 18 May 2002 10:17:27 GMT
Message-ID: <fa.ibo2vav.5ki0rq@ifi.uio.no>
References: <fa.10gjp53v.11kmuot@ifi.uio.no>
Linus Torvalds <torva...@transmeta.com> writes:
> On Fri, 17 May 2002, Rusty Russell wrote:
> >
> > Sorry I wasn't clear: I'm saying *replace*, not add,
>
> Ok, let _me_ be clear: replacing them with an inferior product that cannot
> tell you partial copies is not going to happen. Not now, not ever. You
> would break all the users who _require_ knowing about a read() that only
> gave you 5 out of 50 bytes.
Are you sure they even exist? As far as I can see, nearly everybody relies
on zeroing of the target on exception instead.
At least for the SSE-optimized copy_*_user, always returning -EFAULT would
be much better, because computing the missed byte count is painful from an
unrolled loop and cannot even be done accurately (8-byte accuracy is the
best you get with 8-byte loads/stores). With that in mind, I think the byte
count is broken by design, because it cannot be implemented correctly
unless you do byte copies.
I remember TCP was given as the prime user when this interface was
introduced in 2.1, but TCP does not use the byte count currently and never has
(in fact the primary memory copy interface of TCP - csum_copy_* - does not
even support it).
-Andi
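[Illustration, not part of the original message.] The granularity problem described above can be shown with a simulated fault boundary. In this userspace sketch both function names and the `readable` parameter are assumptions: copy_bytewise() reports an exact residue, while copy_wordwise(), which moves 8 bytes per step, can only report the residue to the nearest 8 bytes.

```c
#include <string.h>

/*
 * Simulated fault model: source bytes at offset >= readable are
 * inaccessible.  Both functions return the number of bytes NOT copied.
 */

/* Byte-at-a-time copy: the residue is exact. */
unsigned long copy_bytewise(char *to, const char *from,
			    unsigned long n, unsigned long readable)
{
	unsigned long i;

	for (i = 0; i < n && i < readable; i++)
		to[i] = from[i];
	return n - i;
}

/*
 * 8-bytes-at-a-time copy (n assumed to be a multiple of 8 for brevity).
 * A chunk that touches any inaccessible byte fails as a whole, so the
 * residue is only accurate to 8 bytes.
 */
unsigned long copy_wordwise(char *to, const char *from,
			    unsigned long n, unsigned long readable)
{
	unsigned long done = 0;

	while (done + 8 <= n && done + 8 <= readable) {
		memcpy(to + done, from + done, 8);
		done += 8;
	}
	return n - done;
}
```

For a 32-byte request with 13 readable bytes, the byte-wise copy reports 19 bytes left over and the word-wise copy reports 24, even though bytes 8 to 12 were readable. Getting an exact count out of the fast path would need a byte-wise retry of the failing chunk, which is the extra work the message objects to.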
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Sat, 18 May 2002 09:14:10 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Andi Kleen <a...@suse.de>
cc: linux-ker...@vger.kernel.org, <ru...@rustcorp.com.au>,
<a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <p733cwpzrp3.fsf@oldwotan.suse.de>
Original-Message-ID: <Pine.LNX.4.44.0205180910570.26742-100000@home.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sat, 18 May 2002 16:15:47 GMT
Message-ID: <fa.l3tlo9v.1m1q3hi@ifi.uio.no>
References: <fa.ibo2vav.5ki0rq@ifi.uio.no>
Lines: 15
On 18 May 2002, Andi Kleen wrote:
>
> Are you sure they even exist ?
Oh, like read() or write() for regular files? Yup, they exist.
Linus
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: Linus Torvalds <torva...@transmeta.com>
Cc: linux-ker...@vger.kernel.org, ru...@rustcorp.com.au,
a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Sat, 18 May 2002 09:14:10 MST."
<Pine.LNX.4.44.0205180910570.26742-100000@home.transmeta.com>
Original-Date: Sun, 19 May 2002 12:10:38 +1000
Original-Message-Id: <E179GA4-0004ZT-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 02:09:10 GMT
Message-ID: <fa.ijk9ruv.vm85aa@ifi.uio.no>
References: <fa.l3tlo9v.1m1q3hi@ifi.uio.no>
Lines: 27
In message <Pine.LNX.4.44.0205180910570.26742-100...@home.transmeta.com> you write:
>
>
> On 18 May 2002, Andi Kleen wrote:
> >
> > Are you sure they even exist ?
>
> Oh, like read() or write() for regular files? Yup, they exist.
Huh? No: if you ask for 2000 bytes into a buffer that can only take 1000
bytes without hitting an unmapped page, returning EFAULT or giving a
SIGSEGV is perfectly acceptable.
As a coder, I'd *really* prefer that to hiding the bug!
There's only one case which really actually cares for a valid reason:
the hack in copying mount options.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Sat, 18 May 2002 20:01:48 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Rusty Russell <ru...@rustcorp.com.au>
cc: linux-ker...@vger.kernel.org, <a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <E179GA4-0004ZT-00@wagner.rustcorp.com.au>
Original-Message-ID: <Pine.LNX.4.44.0205181958140.30454-100000@home.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 03:02:53 GMT
Message-ID: <fa.l3t3oiv.1n1g29r@ifi.uio.no>
References: <fa.ijk9ruv.vm85aa@ifi.uio.no>
Lines: 31
On Sun, 19 May 2002, Rusty Russell wrote:
>
> Huh? No, you ask for 2000 bytes into a buffer that can only take 1000
> bytes without hitting an unmapped page, returning EFAULT or giving a
> SIGSEGV is perfectly acceptable.
Bzzt, wrong answer.
Partial reads/writes are perfectly possible and non-buggy for any system
that uses variations of mmap/mprotect to implement user-level memory
management, for example persistent databases, garbage collection etc.
Which means that if half of a buffer used for "read()" just happens to be
marked non-writable for some GC purpose, the kernel HAS to have the
ability to return the right answer (which in this case is to say "I could
only read 1000 bytes"). Because anything else just doesn't give the GC
library (or whatever) any way to recover nicely.
> As a coder, I'd *really* prefer that to hiding the bug!
Rusty, face it, you're wrong on this one. Just drop it.
Linus
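[Illustration, not part of the original message.] The behaviour argued for above depends on the copy routine reporting how many bytes it could not deliver. A minimal userspace sketch, assuming a hypothetical sim_copy_to_user() with a `writable` fault boundary and a read()-style helper sim_read(), neither of which is kernel code:

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_to_user(): only the first `writable`
 * bytes of the destination can be written.  Returns bytes NOT copied.
 */
unsigned long sim_copy_to_user(char *to, const char *from,
			       unsigned long n, unsigned long writable)
{
	unsigned long ok = n < writable ? n : writable;

	memcpy(to, from, ok);
	return n - ok;
}

/*
 * read()-style helper: report how much data was delivered, and return
 * -EFAULT only when nothing at all could be copied.
 */
long sim_read(char *ubuf, unsigned long count,
	      const char *kdata, unsigned long writable)
{
	unsigned long left = sim_copy_to_user(ubuf, kdata, count, writable);
	unsigned long copied = count - left;

	if (copied == 0 && count != 0)
		return -EFAULT;
	return (long)copied;
}
```

A 2000-byte request into a buffer whose first 1000 bytes are writable comes back as a short read of 1000. A routine that returned only 0 or -EFAULT would leave sim_read() unable to compute that number.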
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!skynet.be!skynet.be!newsfeed.online.be!newsfeed1.uni2.dk!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Sat, 18 May 2002 20:05:40 -0700
From: Larry McVoy <l...@bitmover.com>
To: Linus Torvalds <torva...@transmeta.com>
Cc: Rusty Russell <ru...@rustcorp.com.au>, linux-ker...@vger.kernel.org,
a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
Original-Message-ID: <20020518200540.N8794@work.bitmover.com>
Mail-Followup-To: Larry McVoy <l...@work.bitmover.com>,
Linus Torvalds <torva...@transmeta.com>,
Rusty Russell <ru...@rustcorp.com.au>, linux-ker...@vger.kernel.org,
a...@lxorguk.ukuu.org.uk
Original-References: <E179GA4-0004ZT...@wagner.rustcorp.com.au> <Pine.LNX.4.44.0205181958140.30454-100...@home.transmeta.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <Pine.LNX.4.44.0205181958140.30454-100000@home.transmeta.com>; from torvalds@transmeta.com on Sat, May 18, 2002 at 08:01:48PM -0700
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 03:06:43 GMT
Message-ID: <fa.ib69oav.1k4mb84@ifi.uio.no>
References: <fa.l3t3oiv.1n1g29r@ifi.uio.no>
Lines: 23
On Sat, May 18, 2002 at 08:01:48PM -0700, Linus Torvalds wrote:
> On Sun, 19 May 2002, Rusty Russell wrote:
> >
> > Huh? No, you ask for 2000 bytes into a buffer that can only take 1000
> > bytes without hitting an unmapped page, returning EFAULT or giving a
> > SIGSEGV is perfectly acceptable.
>
> Bzzt, wrong answer.
Linus is absolutely right. The correct semantics are to return the number
of bytes read, if it is greater than zero, and on the next read return
the error. This has been a corner case in read for a long time in various
Unix versions, and Linus has it right. I went through this back at Sun
and we explored all the different ways, and the bottom line is that you
first ACK that you moved some data and then you NAK on the next read.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!HSNX.atgi.net!newsfeed.sjc.globix.net!cyclone-sf.pbi.net!216.218.192.242!news.he.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: Linus Torvalds <torva...@transmeta.com>
Cc: linux-ker...@vger.kernel.org, a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Sat, 18 May 2002 20:01:48 MST."
<Pine.LNX.4.44.0205181958140.30454-100000@home.transmeta.com>
Original-Date: Sun, 19 May 2002 13:31:55 +1000
Original-Message-Id: <E179HQd-0000j7-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 03:30:56 GMT
Message-ID: <fa.is3g7uv.1ehae@ifi.uio.no>
References: <fa.l3t3oiv.1n1g29r@ifi.uio.no>
Lines: 43
In message <Pine.LNX.4.44.0205181958140.30454-100...@home.transmeta.com> you write:
>
>
> On Sun, 19 May 2002, Rusty Russell wrote:
> >
> > Huh? No, you ask for 2000 bytes into a buffer that can only take 1000
> > bytes without hitting an unmapped page, returning EFAULT or giving a
> > SIGSEGV is perfectly acceptable.
>
> Bzzt, wrong answer.
>
> Partial reads/writes are perfectly possible and non-buggy for any system
> that uses variations of mmap/mprotect to implement user-level memory
> management, for example persistent databases, garbage collection etc.
>
> Which means that if half of a buffer used for "read()" just happens to be
> marked non-writable for some GC purpose, the kernel HAS to have the
> ability to return the right answer (which in this case is to say "I could
> only read 1000 bytes"). Because anything else just doesn't give the GC
> library (or whatever) any way to recover nicely.
Um, what about delivering a SIGSEGV? So, copy_to/from_user always
returns 0, but a signal is delivered. Then the only places which need
to be clever are the mount option copying, and the signal delivery
code for SIGSEGV (which uses copy_to_user).
This has the benefit of not breaking existing kernel code, whichever
interpretation of the return value is used.
> > As a coder, I'd *really* prefer that to hiding the bug!
>
> Rusty, face it, you're wrong on this one. Just drop it.
That's certainly possible,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Sat, 18 May 2002 20:34:08 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Rusty Russell <ru...@rustcorp.com.au>
cc: linux-ker...@vger.kernel.org, <a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <E179HQd-0000j7-00@wagner.rustcorp.com.au>
Original-Message-ID: <Pine.LNX.4.44.0205182030160.31341-100000@home.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 03:34:56 GMT
Message-ID: <fa.l2snnpv.1k1421i@ifi.uio.no>
References: <fa.is3g7uv.1ehae@ifi.uio.no>
Lines: 21
On Sun, 19 May 2002, Rusty Russell wrote:
>
> Um, what about delivering a SIGSEGV? So, copy_to/from_user always
> returns 0, but a signal is delivered.
That doesn't help. It's against some stupid SUS rule, I'm afraid.
(And THAT is a stupid rule, I 100% agree with. It means that some things
return -EFAULT, and other things do SIGSEGV, and the only difference is
whether something is a system call or is implemented as a library thing.
UNIX should always just have segfaulted, but there you are..)
Linus
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!tethys.csu.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: Linus Torvalds <torva...@transmeta.com>
Cc: linux-ker...@vger.kernel.org, a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Sun, 19 May 2002 13:31:55 +1000."
Original-Date: Sun, 19 May 2002 13:38:05 +1000
Original-Message-Id: <E179HWb-0000jY-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 03:36:23 GMT
Message-ID: <fa.is607ev.7ugqe@ifi.uio.no>
Lines: 24
> Um, what about delivering a SIGSEGV? So, copy_to/from_user always
> returns 0, but a signal is delivered. Then the only places which need
> to be clever are the mount option copying, and the signal delivery
> code for SIGSEGV (which uses copy_to_user).
Sorry, this doesn't work here either: this would return the wrong
value from read.
Of course, everyone who does more than one copy_to_user should be
checking that return value, and not doing:
        if (copy_to_user(uptr....)
            || copy_to_user(uptr+10,....))
                return -EFAULT;
So that such gc schemes actually work,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: Larry McVoy <l...@bitmover.com>
Cc: linux-ker...@vger.kernel.org, a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Sat, 18 May 2002 20:05:40 MST."
<20020518200540.N8794@work.bitmover.com>
Original-Date: Sun, 19 May 2002 14:01:25 +1000
Original-Message-Id: <E179HtC-0001cB-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 04:00:18 GMT
Message-ID: <fa.iombvmv.4keoif@ifi.uio.no>
References: <fa.ib69oav.1k4mb84@ifi.uio.no>
Lines: 40
In message <20020518200540.N8...@work.bitmover.com> you write:
> On Sat, May 18, 2002 at 08:01:48PM -0700, Linus Torvalds wrote:
> > On Sun, 19 May 2002, Rusty Russell wrote:
> > >
> > > Huh? No, you ask for 2000 bytes into a buffer that can only take 1000
> > > bytes without hitting an unmapped page, returning EFAULT or giving a
> > > SIGSEGV is perfectly acceptable.
> >
> > Bzzt, wrong answer.
>
> Linus is absolutely right. The correct semantics are to return the number
> of bytes read, if they are greater than zero, and on the next read return
> the error. This has been a corner case in read for a long time in various
> Unix versions, and Linus has it right. I went through this back at Sun
> and we explored all the different ways, and the bottom line is that you
> first ACK that you moved some data and then you NAK on the next read.
It's interesting to look at this backwards:
Imagine if copy_to_user returned void and delivered a SIGSEGV
on fault, and always had.
Now, to fix this, we'd want to add new code paths to the 5,500
callers throughout the kernel.
I'm pretty sure everyone would balk. They'd say "sorry, you're going
to have to wrap your syscalls somehow".
But as we all know, it is harder to remove a feature from Linux, than
to get the camel book through the eye of a needle (or something).
Oh well,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Sat, 18 May 2002 21:02:18 -0700
From: Larry McVoy <l...@bitmover.com>
To: Rusty Russell <ru...@rustcorp.com.au>
Cc: Larry McVoy <l...@bitmover.com>, linux-ker...@vger.kernel.org,
a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
Original-Message-ID: <20020518210218.P8794@work.bitmover.com>
Mail-Followup-To: Larry McVoy <l...@work.bitmover.com>,
Rusty Russell <ru...@rustcorp.com.au>,
Larry McVoy <l...@bitmover.com>, linux-ker...@vger.kernel.org,
a...@lxorguk.ukuu.org.uk
Original-References: <20020518200540.N8...@work.bitmover.com> <E179HtC-0001cB...@wagner.rustcorp.com.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <E179HtC-0001cB-00@wagner.rustcorp.com.au>; from rusty@rustcorp.com.au on Sun, May 19, 2002 at 02:01:25PM +1000
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 04:03:17 GMT
Message-ID: <fa.i9mro9v.1mk4b8t@ifi.uio.no>
References: <fa.iombvmv.4keoif@ifi.uio.no>
Lines: 18
On Sun, May 19, 2002 at 02:01:25PM +1000, Rusty Russell wrote:
> But as we all know, it is harder to remove a feature from Linux, than
> to get the camel book through the eye of a needle (or something).
It's possible that I'm too tired to have grasped this, but if I have,
you're all wet. In all cases, read needs to return the number of bytes
successfully moved. If you ask for N and 1/2 of the way through N you
are going to get a fault, and you return SEGFAULT, now how can I ever
find out that N/2 bytes actually made it out to me? I want to know that.
If you are arguing that return N/2 is wrong, you are incorrect.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!howland.erols.net!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Sat, 18 May 2002 22:23:05 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Rusty Russell <ru...@rustcorp.com.au>
cc: linux-ker...@vger.kernel.org, <a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <E179HWb-0000jY-00@wagner.rustcorp.com.au>
Original-Message-ID: <Pine.LNX.4.44.0205182210330.878-100000@home.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 05:24:02 GMT
Message-ID: <fa.mljskkv.1d2e2pi@ifi.uio.no>
References: <fa.is607ev.7ugqe@ifi.uio.no>
Lines: 57
On Sun, 19 May 2002, Rusty Russell wrote:
> > returns 0, but a signal is delivered. Then the only places which need
> > to be clever are the mount option copying, and the signal delivery
> > code for SIGSEGV (which uses copy_to_user).
>
> Sorry, this doesn't work here either: this would return the wrong
> value from read.
Oh, read() has to return the right value, but we should _also_ do a
SIGSEGV, in my opinion (it would also catch all those programs that didn't
expect it).
However, that apparently flies in the face of UNIX history and apparently
some standard (whether it was POSIX or SuS or something else, I can't
remember, but that discussion came up earlier..)
> Of course, everyone who does more than one copy_to_user should be
> checking that return value, and not doing:
>
>         if (copy_to_user(uptr....)
>             || copy_to_user(uptr+10,....))
>                 return -EFAULT;
>
> So that such gc schemes actually work,
Note that _most_ system calls are simply just re-startable, ie if your
"stat()" system call dies half-way and returns EFAULT, you can re-start it
without having to know how much of the "stat" structure you might have
filled in. So for many things a plain -EFAULT is plenty good enough, and
your "copy_to/from_user() should return 0/-EFAULT" would work for them.
But read (and particularly write) are _not_ re-startable without the
knowledge of how much was written, because they change f_pos and other
things ("write()" in particular changes a _lot_ of "other things", namely
the data in the file itself, of course).
There are other system calls that aren't re-startable, but basically
read/write are the "big ones", and thus Linux should try its best to make
them work in an environment that requires restartability. Most programs
can live without various random ioctl's and special system calls, but very
very few programs/environments can live without read/write.
("restartable" here doesn't mean that the _kernel_ would re-start them,
but that a "gc-aware library" can make wrappers around them and correctly
restart them internally, if you see my point - kind of like how stdio
already handles the issue of EINTR returns for read/write, which is
actually very similar in nature).
Linus
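A minimal userspace sketch of the kind of wrapper described above, in the
spirit of how stdio retries on EINTR. It assumes the kernel reports a
partial count first and EFAULT on the following call. The name
restartable_read and the unprotect hook are hypothetical, not part of any
real library; a real GC library would supply its own way to make the
buffer writable again (for example via mprotect()).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical hook: make [addr, addr+len) writable again.
 * Returns 0 on success. Supplied by the GC library, not by libc. */
typedef int (*unprotect_fn)(void *addr, size_t len);

/* Keep calling read() until count bytes arrive, EOF is hit, or an
 * error occurs that the wrapper cannot recover from. */
ssize_t restartable_read(int fd, void *buf, size_t count,
			 unprotect_fn unprotect)
{
	char *p = buf;
	size_t done = 0;

	while (done < count) {
		ssize_t n = read(fd, p + done, count - done);

		if (n > 0) {
			/* Partial success: resume where the kernel stopped. */
			done += (size_t)n;
			continue;
		}
		if (n == 0)
			break;		/* end of file */
		if (errno == EINTR)
			continue;	/* interrupted before any data moved */
		if (errno == EFAULT && unprotect &&
		    unprotect(p + done, count - done) == 0)
			continue;	/* buffer fixed up, restart the read */
		/* Unrecoverable: report what was transferred, if anything. */
		return done ? (ssize_t)done : -1;
	}
	return (ssize_t)done;
}
```

The wrapper only works because each read() reports exactly how many bytes
were moved; without that count it could not know where to resume.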
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!HSNX.atgi.net!cyclone-sf.pbi.net!216.218.192.242!news.he.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Benjamin Herrenschmidt <b...@kernel.crashing.org>
To: Linus Torvalds <torva...@transmeta.com>,
Rusty Russell <ru...@rustcorp.com.au>
Cc: <linux-ker...@vger.kernel.org>, <a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
Original-Date: Sat, 18 May 2002 22:47:17 +0100
Original-Message-Id: <20020518214717.3526@smtp.wanadoo.fr>
In-Reply-To: <Pine.LNX.4.44.0205182210330.878-100000@home.transmeta.com>
Original-References: <Pine.LNX.4.44.0205182210330.878-100...@home.transmeta.com>
X-Mailer: CTM PowerMail 3.1.2 F <http://www.ctmdev.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 09:54:05 GMT
Message-ID: <fa.ecl09mv.1kigtro@ifi.uio.no>
References: <fa.mljskkv.1d2e2pi@ifi.uio.no>
Lines: 30
>But read (and particularly write) are _not_ re-startable without the
>knowledge of how much was written, because they change f_pos and other
>things ("write()" in particular changes a _lot_ of "other things", namely
>the data in the file itself, of course).
Looking at generic_file_write(), it ignores the count returned by
copy_from_user and always commits a write for the whole requested
count, regardless of how much could actually be read from userland.
The result of copy_from_user is only used as an error condition.
generic_file_read() on the other hand seems to be ok.
>There are other system calls that aren't re-startable, but basically
>read/write are the "big ones", and thus Linux should try its best to make
>them work in an environment that requires restartability. Most programs
>can live without various random ioctl's and special system calls, but very
>very few programs/environments can live without read/write.
>
>("restartable" here doesn't mean that the _kernel_ would re-start them,
>but that a "gc-aware library" can make wrappers around them and correctly
>restart them internally, if you see my point - kind of like how stdio
>already handles the issue of EINTR returns for read/write, which is
>actually very similar in nature).
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Sun, 19 May 2002 11:29:06 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Benjamin Herrenschmidt <b...@kernel.crashing.org>
cc: Rusty Russell <ru...@rustcorp.com.au>, <linux-ker...@vger.kernel.org>,
<a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <20020518214717.3526@smtp.wanadoo.fr>
Original-Message-ID: <Pine.LNX.4.44.0205191125120.3104-100000@home.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Sun, 19 May 2002 18:30:37 GMT
Message-ID: <fa.m6tscqv.140u9gg@ifi.uio.no>
References: <fa.ecl09mv.1kigtro@ifi.uio.no>
Lines: 40
On Sat, 18 May 2002, Benjamin Herrenschmidt wrote:
>
> Looking at generic_file_write(), it ignores the count returned by
> copy_from_user and always commits a write for the whole requested
> count, regardless of how much could actually be read from userland.
> The result of copy_from_user is only used as an error condition.
And this is exactly what makes it re-startable.
A faulting write will fill some subsequent memory area with zeroes, but a
subsequent write can complete the original one.
It has to _commit_ the whole area, because it uses the pre-fault size
information to optimize away reads etc, ie if you do a
        write(fd, buf, 4096);
at a page-aligned offset, the write code knows that it shouldn't read the
old contents because they get overwritten.
Which is why we need to commit the whole 4096 bytes, even if we only
actually were able to get a single byte from user space.
But by then telling user space that we couldn't actually write more than 1
byte, we give user space the _ability_ to re-start the write with the
missing 4095 bytes.
> generic_file_read() on the other hand seems to be ok.
That one doesn't have any of the same issues.
Linus
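A small userspace model of the behaviour described above, offered as a
sketch rather than the actual generic_file_write() code. It assumes a
single in-memory "page", a stand-in model_copy_from_user() that returns
the number of bytes it could not copy, and a "readable" parameter that
simulates where the user buffer faults. All names here are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Stand-in for copy_from_user(): only the first `readable` bytes of the
 * source are accessible. Returns the number of bytes NOT copied. */
static size_t model_copy_from_user(unsigned char *to, const char *from,
				   size_t n, size_t readable)
{
	size_t ok = n < readable ? n : readable;

	memcpy(to, from, ok);
	return n - ok;
}

/* Model of a write into a page whose old contents were never read in.
 * The whole range [off, off+n) is committed; whatever could not be
 * fetched from user space is zero-filled. Returns the bytes actually
 * taken from the user buffer, which is what write() would report.
 * The caller must ensure off + n <= MODEL_PAGE_SIZE. */
size_t model_write(unsigned char *page, size_t off, const char *ubuf,
		   size_t n, size_t readable)
{
	size_t left = model_copy_from_user(page + off, ubuf, n, readable);

	memset(page + off + (n - left), 0, left);
	return n - left;
}
```

In this model a 4096-byte write that faults after one byte returns 1 and
leaves 4095 zero bytes behind; a second call starting at offset 1 with the
remaining 4095 bytes then completes the original write.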
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!headwall.stanford.edu!newsfeed.utk.edu!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: Linus Torvalds <torva...@transmeta.com>
Cc: Rusty Russell <ru...@rustcorp.com.au>, linux-ker...@vger.kernel.org,
a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Sun, 19 May 2002 11:29:06 MST."
<Pine.LNX.4.44.0205191125120.3104-100000@home.transmeta.com>
Original-Date: Mon, 20 May 2002 12:06:07 +1000
Original-Message-Id: <E179cYq-0004I3-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 20 May 2002 02:04:02 GMT
Message-ID: <fa.ip3ob6v.41mk2a@ifi.uio.no>
References: <fa.m6tscqv.140u9gg@ifi.uio.no>
Lines: 34
In message <Pine.LNX.4.44.0205191125120.3104-100...@home.transmeta.com> you write:
>
>
> On Sat, 18 May 2002, Benjamin Herrenschmidt wrote:
> >
> > Looking at generic_file_write(), it ignores the count returned by
> > copy_from_user and always commits a write for the whole requested
> > count, regardless of how much could actually be read from userland.
> > The result of copy_from_user is only used as an error condition.
>
> And this is exactly what makes it re-startable.
If read always returns the amount read (ignoring any copy_to_user
errors), then you can repeat it by seeking backwards[1] and redoing the
read.
So copy_to_user can simply deliver a SIGSEGV and return "success", and
everything will work (except sockets, pipes, etc).
Is this satisfactory? I'd really like to get rid of 5,500 code paths
in the kernel...
BTW, SUSv3/POSIX.1-2001 says it's OK,
Rusty.
[1] No, this won't work on pipes & sockets, but the whole idea won't
work on many devices anyway...
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!newsfeed.online.be!130.240.42.8.MISMATCH!luth.se!newsfeed1.uni2.dk!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Sun, 19 May 2002 19:54:32 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Rusty Russell <ru...@rustcorp.com.au>
cc: linux-ker...@vger.kernel.org, <a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <E179cYq-0004I3-00@wagner.rustcorp.com.au>
Original-Message-ID: <Pine.LNX.4.44.0205191951460.22433-100000@home.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 20 May 2002 02:55:27 GMT
Message-ID: <fa.l4dbohv.1mhg29l@ifi.uio.no>
References: <fa.ip3ob6v.41mk2a@ifi.uio.no>
Lines: 41
On Mon, 20 May 2002, Rusty Russell wrote:
>
> If read always returns the amount read (ignoring any copy_to_user
> errors), then you can repeat it by seeking backwards[1] and redoing the
> read.
No.
> So copy_to_user can simply deliver a SIGSEGV and return "success", and
> everything will work (except sockets, pipes, etc).
I don't mind the SIGSEGV, but I refuse to make a stupid change that has
absolutely _zero_ reason for it.
The current "copy_to/from_user()" is perfectly fine. It's very simple to
do
        if (copy_from_user(xxx))
                return -EFAULT;
and it is not AT ALL simpler to do
        ret = copy_from_user(xxx);
        if (ret)
                return ret;
which is apparently your suggestion.
So a lot of people didn't get it? Arnaldo seems to have fixed a lot of
them already, and maybe you who apparently care can add _documentation_,
but the fact is that there is no reason to make a less powerful interface.
Linus
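For readers unfamiliar with the convention under discussion, here is a
userspace sketch of the two calling patterns. It assumes a stand-in
model_copy_to_user() that, like the kernel routine, returns the number of
bytes it could NOT copy (0 on full success); the "writable" parameter
simulates where the user buffer faults. The function names are
hypothetical and exist only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for copy_to_user(): only the first `writable` bytes of the
 * destination are accessible. Returns the number of bytes NOT copied. */
static size_t model_copy_to_user(char *to, const char *from,
				 size_t n, size_t writable)
{
	size_t ok = n < writable ? n : writable;

	memcpy(to, from, ok);
	return n - ok;
}

/* stat()-style caller: all or nothing, so any shortfall is -EFAULT.
 * Returning the raw count here would be the classic bug, because a
 * positive "bytes left" value is not an error code. */
long model_fill_struct(char *ubuf, const char *kbuf, size_t n,
		       size_t writable)
{
	if (model_copy_to_user(ubuf, kbuf, n, writable))
		return -EFAULT;
	return 0;
}

/* read()-style caller: report partial progress, and fail with -EFAULT
 * only when nothing at all could be copied. */
long model_read(char *ubuf, const char *kbuf, size_t n, size_t writable)
{
	size_t left = model_copy_to_user(ubuf, kbuf, n, writable);

	if (n && left == n)
		return -EFAULT;
	return (long)(n - left);
}
```

With a 2000-byte request into a buffer that faults after 1000 bytes, the
stat()-style caller fails with -EFAULT while the read()-style caller
returns 1000, which is the distinction the richer return value makes
possible.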
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
From: Rusty Russell <ru...@rustcorp.com.au>
To: Linus Torvalds <torva...@transmeta.com>
Cc: linux-ker...@vger.kernel.org, a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-reply-to: Your message of "Sun, 19 May 2002 19:54:32 MST."
<Pine.LNX.4.44.0205191951460.22433-100000@home.transmeta.com>
Original-Date: Mon, 20 May 2002 14:53:18 +1000
Original-Message-Id: <E179fAd-0005vs-00@wagner.rustcorp.com.au>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 20 May 2002 04:50:50 GMT
Message-ID: <fa.jh687uv.p46hab@ifi.uio.no>
References: <fa.l4dbohv.1mhg29l@ifi.uio.no>
Lines: 39
In message <Pine.LNX.4.44.0205191951460.22433-100...@home.transmeta.com> you write:
>         ret = copy_from_user(xxx);
>         if (ret)
>                 return ret;
>
> which is apparently your suggestion.
Not quite:
        copy_from_user(xxx);
Is my suggestion. No error return.
> So a lot of people didn't get it? Arnaldo seems to have fixed a lot of
> them already
Yeah, thanks to my kernel audit. But I won't be auditing all 5,500
every release (I promised Alan I'd do 2.4 though: I'm waiting for the
next Marcelo kernel).
> and maybe you who apparently care can add _documentation_,
> but the fact is that there is no reason to make a less powerful interface.
It's been documented in the kernel docs. It's also in the device
driver book. And people still get it wrong because it's "special".
Please please please, Linus: to me this is like the min & max macros:
you didn't want a programmer trap in there, but everyone else
disagreed. If there's any sane way we can get rid of this trap (which
has been shown to cause real bugs), I would weigh it very carefully.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!news.tele.dk!small.news.tele.dk!212.74.64.35!colt.net!newsfeed.esat.net!nslave.kpnqwest.net!nloc.kpnqwest.net!nmaster.kpnqwest.net!lois.kpnqwest.se!Norway.EU.net!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Sun, 19 May 2002 17:12:07 -0300
From: Arnaldo Carvalho de Melo <a...@conectiva.com.br>
To: Rusty Russell <ru...@rustcorp.com.au>
Cc: Linus Torvalds <torva...@transmeta.com>, linux-ker...@vger.kernel.org,
a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
Original-Message-ID: <20020519201207.GA6690@conectiva.com.br>
Mail-Followup-To: Arnaldo Carvalho de Melo <a...@conectiva.com.br>,
Rusty Russell <ru...@rustcorp.com.au>,
Linus Torvalds <torva...@transmeta.com>,
linux-ker...@vger.kernel.org, a...@lxorguk.ukuu.org.uk
Original-References: <Pine.LNX.4.44.0205191951460.22433-100...@home.transmeta.com> <E179fAd-0005vs...@wagner.rustcorp.com.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <E179fAd-0005vs-00@wagner.rustcorp.com.au>
User-Agent: Mutt/1.3.28i
X-Url: http://advogato.org/person/acme
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 20 May 2002 05:13:41 GMT
Message-ID: <fa.hq7c16v.1r2gooj@ifi.uio.no>
References: <fa.jh687uv.p46hab@ifi.uio.no>
Lines: 25
Em Mon, May 20, 2002 at 02:53:18PM +1000, Rusty Russell escreveu:
> > So a lot of people didn't get it? Arnaldo seems to have fixed a lot of
> > them already
>
> Yeah, thanks to my kernel audit. But I won't be auditing all 5,500
> every release (I promised Alan I'd do 2.4 though: I'm waiting for the
> next Marcelo kernel).
Yeah, that put the needed pressure for the patches to get accepted ;) I and
others had done most of that in the past but patches were getting lost in the
noise, now they are getting in much more easily 8)
http://kerneljanitors.org/TODO had that listed for a long time 8)
Anyway, thanks again for the audit! :-)
And I'm all for something that is easier to use, as I, like you, don't
want to keep auditing for the very same thing over and over again.
- Arnaldo
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Mon, 20 May 2002 09:00:53 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Rusty Russell <ru...@rustcorp.com.au>
cc: linux-ker...@vger.kernel.org, <a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <E179fAd-0005vs-00@wagner.rustcorp.com.au>
Original-Message-ID: <Pine.LNX.4.44.0205200856460.23874-100000@home.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Mon, 20 May 2002 16:01:33 GMT
Message-ID: <fa.l4dbppv.1lhg1hv@ifi.uio.no>
References: <fa.jh687uv.p46hab@ifi.uio.no>
Lines: 30
On Mon, 20 May 2002, Rusty Russell wrote:
>
> Not quite:
>         copy_from_user(xxx);
>
> Is my suggestion. No error return.
The fact is, that that would still make you have to audit all the users,
AND you'd be left up shit creek for the users who _need_ the error return,
so now you not only have to fix all existing broken stuff, you have to fix
the _correct_ stuff too, in some strange way. I agree with returning SIGSEGV,
but it is NOT a _replacement_ for getting the right error return from
read/write.
So what's your point? You want to dumb down the interfaces until you can't
make mistakes, and only idiots will be able to use the system.
As long as you continue to push an interface that DOES NOT WORK, there's
no way you can win this argument. read()/write() _needs_ to work, and
that's not a "warm and fuzzy" kind of thing you can play with.
Linus
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Thu, 16 May 2002 23:53:36 +0000
From: Pavel Machek <pa...@suse.cz>
To: Linus Torvalds <torva...@transmeta.com>
Cc: Rusty Russell <ru...@rustcorp.com.au>, linux-ker...@vger.kernel.org,
a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
Original-Message-ID: <20020516235335.C116@toy.ucw.cz>
Original-References: <E179HQd-0000j7...@wagner.rustcorp.com.au> <Pine.LNX.4.44.0205182030160.31341-100...@home.transmeta.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <Pine.LNX.4.44.0205182030160.31341-100000@home.transmeta.com>; from torvalds@transmeta.com on Sat, May 18, 2002 at 08:34:08PM -0700
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 21 May 2002 20:46:27 GMT
Message-ID: <fa.fjc2ovv.18hcpbr@ifi.uio.no>
References: <fa.l2snnpv.1k1421i@ifi.uio.no>
Lines: 34
Hi!
> > Um, what about delivering a SIGSEGV? So, copy_to/from_user always
> > returns 0, but a signal is delivered.
>
> That doesn't help. It's against some stupid SUS rule, I'm afraid.
>
> (And THAT is a stupid rule, I 100% agree with. It means that some things
> return -EFAULT, and other things do SIGSEGV, and the only difference is
> whether something is a system call or is implemented as a library thing.
> UNIX should always just have segfaulted, but there you are..)
I thought POSIX made it explicit that you may SIGSEGV or EFAULT at your
option. If that SUS rule is stupid, we should just drop it.
Performance advantage from MMX-copy-to-user is probably well worth it.
Ouch, and your read example. Imagine you do read (fd, buf, 12000), and first
page of buf is there, second is not and third is [could happen in your GC
case]. What if copy-to-user decides to first write byte 11045 of buffer,
then byte 17, then byte 4875 (and fault)? I think the kernel *has* the right
to do that. What will it use as return value?
I think such GC library is seriously broken.
Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: Tue, 21 May 2002 13:47:21 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Pavel Machek <pa...@suse.cz>
cc: Rusty Russell <ru...@rustcorp.com.au>, <linux-ker...@vger.kernel.org>,
<a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <20020516235335.C116@toy.ucw.cz>
Original-Message-ID: <Pine.LNX.4.33.0205211340080.3073-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 21 May 2002 20:50:38 GMT
Message-ID: <fa.o9pvf6v.g4k5rd@ifi.uio.no>
References: <fa.fjc2ovv.18hcpbr@ifi.uio.no>
Lines: 52
On Thu, 16 May 2002, Pavel Machek wrote:
> I thought POSIX made it explicit that you may SIGSEGV or EFAULT at your
> option. If that SUS rule is stupid, we should just drop it.
>
> Performance advantage from MMX-copy-to-user is probably well worth it.
Stop this STUPID "it speeds things up" argument.
It's not TRUE.
We still have to do exactly the same things we do right now, because even
if we SIGSEGV we still have to return the right number of bytes
read/written.
SIGSEGV doesn't mean that the system call wouldn't complete. It removes
_none_ of the kernel fixup handling, because the SIGSEGV won't be
delivered until we return to user mode anyway. So please stop mixing these
two issues up.
There are two completely orthogonal issues:
- Use SIGSEGV on system calls or not.
Using SIGSEGV makes the system call vs library routine issue more
regular, but it does not change the fact that the system call has to
return _something_.
- system call return value for partially successful read/write.
We already have the exact same issue wrt something like SIGINT, and
nobody sane would ever suggest that we always return the "whole" thing
if we're interrupted by an external signal.
Similarly, it's naive and stupid to suggest we return success if we get
interrupted by a SIGSEGV/EFAULT.
On the first issue (SIGSEGV) I'm certainly open to trying that out,
although I'm fairly certain there was _some_ reason we didn't do this a
few years ago.
On the second issue, absolutely _nobody_ has shown any reason why we
should break the existing code that does this correctly, and I've shown
reasons why breaking it is STUPID.
Linus
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Date: Tue, 21 May 2002 23:17:27 +0200
From: Pavel Machek <pa...@suse.cz>
To: Linus Torvalds <torva...@transmeta.com>
Cc: Rusty Russell <ru...@rustcorp.com.au>, linux-ker...@vger.kernel.org,
a...@lxorguk.ukuu.org.uk
Subject: Re: AUDIT: copy_from_user is a deathtrap.
Original-Message-ID: <20020521211727.GG22878@atrey.karlin.mff.cuni.cz>
Original-References: <20020516235335.C...@toy.ucw.cz> <Pine.LNX.4.33.0205211340080.3073-100...@penguin.transmeta.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <Pine.LNX.4.33.0205211340080.3073-100000@penguin.transmeta.com>
User-Agent: Mutt/1.3.27i
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 21 May 2002 21:19:55 GMT
Message-ID: <fa.j05ubdv.lgg98f@ifi.uio.no>
References: <fa.o9pvf6v.g4k5rd@ifi.uio.no>
Lines: 38
Hi!
> > I thought POSIX made it explicit that you may SIGSEGV or EFAULT at your
> > option. If that SUS rule is stupid, we should just drop it.
> >
> > Performance advantage from MMX-copy-to-user is probably well worth it.
>
> Stop this STUPID "it speeds things up" argument.
>
> It's not TRUE.
>
> We still have to do exactly the same things we do right now, because even
> if we SIGSEGV we still have to return the right number of bytes
> read/written.
>
> SIGSEGV doesn't mean that the system call wouldn't complete. It removes
> _none_ of the kernel fixup handling, because the SIGSEGV won't be
> delivered until we return to user mode anyway. So please stop mixing these
> two issues up.
If you pass bad pointer to memcpy(), you don't expect any reasonable
return value, right?
So if you pass a bad pointer to read(), why would you expect a "number of
bytes read" return? It's true that the kernel can't simply not return
anything, but giving SIGSEGV and returning -EFAULT seems pretty
reasonable to me. If they really want to, they might extract the number of
bytes read from the address the SIGSEGV occurred at [but that's a dirty hack,
and people will hopefully realise that and not rely on it].
Pavel
--
Casualities in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: Tue, 21 May 2002 14:25:46 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Pavel Machek <pa...@suse.cz>
cc: Rusty Russell <ru...@rustcorp.com.au>, <linux-ker...@vger.kernel.org>,
<a...@lxorguk.ukuu.org.uk>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <20020521211727.GG22878@atrey.karlin.mff.cuni.cz>
Original-Message-ID: <Pine.LNX.4.33.0205211423490.1405-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 21 May 2002 21:28:04 GMT
Message-ID: <fa.oaqbcuv.j4g5j8@ifi.uio.no>
References: <fa.j05ubdv.lgg98f@ifi.uio.no>
Lines: 27
On Tue, 21 May 2002, Pavel Machek wrote:
>
> If you pass bad pointer to memcpy(), you don't expect any reasonable
> return value, right?
Actually, if I pass a bad pointer to memcpy(), and I have a handler for
the SIGSEGV that fixes up the thing, yes, by golly, I _do_ expect memcpy()
to get the right value.
If it doesn't, then the system is BROKEN.
Face it Pavel, you're WRONG. You're so incredibly wrong that this is not
worth even discussing. Linux does it right now, and I won't break that
correct behaviour just because somebody has an incorrect and silly view of
what is readable.
Face it, copy_to/from_user does the RIGHT THING, and stop whining about
it.
Linus
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
To: pa...@suse.cz (Pavel Machek)
Original-Date: Tue, 21 May 2002 22:44:42 +0100 (BST)
Cc: torva...@transmeta.com (Linus Torvalds),
ru...@rustcorp.com.au (Rusty Russell), linux-ker...@vger.kernel.org,
a...@lxorguk.ukuu.org.uk
In-Reply-To: <20020521211727.GG22878@atrey.karlin.mff.cuni.cz> from "Pavel Machek" at May 21, 2002 11:17:27 PM
X-Mailer: ELM [version 2.5 PL6]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Original-Message-Id: <E17AHQw-0000Jq-00@the-village.bc.nu>
From: Alan Cox <a...@lxorguk.ukuu.org.uk>
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 21 May 2002 21:25:51 GMT
Message-ID: <fa.g53n97v.1562upe@ifi.uio.no>
References: <fa.j05ubdv.lgg98f@ifi.uio.no>
Lines: 13
> So if you pass bad pointer to read(), why would you expect "number of
> bytes read" return? It's true that kernel can't simply not return
Because the standard says either you return the error code and no data
is transferred, or for a partial I/O you return how much was done. It's
not necessarily about accuracy either. If you do a 4k copy_from_user and
error after for some reason, returning -Esomething, that's fine providing you
didn't do anything that consumed data or shifted the file position, etc.
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!cyclone.bc.net!news-hog.berkeley.edu!ucberkeley!news-feed.ifi.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
Original-Message-ID: <3CEAC020.4F63A181@zip.com.au>
Original-Date: Tue, 21 May 2002 14:46:08 -0700
From: Andrew Morton <a...@zip.com.au>
X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.19-pre4 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Alan Cox <a...@lxorguk.ukuu.org.uk>
CC: Pavel Machek <pa...@suse.cz>, Linus Torvalds <torva...@transmeta.com>,
Rusty Russell <ru...@rustcorp.com.au>, linux-ker...@vger.kernel.org
Subject: Re: AUDIT: copy_from_user is a deathtrap.
Original-References: <20020521211727.GG22...@atrey.karlin.mff.cuni.cz> from "Pavel Machek" at May 21, 2002 11:17:27 PM <E17AHQw-0000Jq...@the-village.bc.nu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 21 May 2002 21:48:27 GMT
Message-ID: <fa.dff5jlv.30iuau@ifi.uio.no>
References: <fa.g53n97v.1562upe@ifi.uio.no>
Lines: 88
Alan Cox wrote:
>
> > So if you pass bad pointer to read(), why would you expect "number of
> > bytes read" return? It's true that kernel can't simply not return
>
> Because the standard says either you return the error code and no data
> is transferred, or for a partial I/O you return how much was done. It's
> not necessarily about accuracy either. If you do a 4k copy_from_user and
> error after for some reason, returning -Esomething, that's fine providing you
> didn't do anything that consumed data or shifted the file position, etc.
Is it safe to stick a nose in here with some irrelevancies?
Pavel makes a reasonable point that copy_*_user may elect
to copy the data in something other than strictly ascending
user virtual addresses. In which case it's not possible to return
a sane "how much was copied" number.
And copy_*_user is buggy at present: it doesn't correctly handle
the case where the source and destination of the copy are overlapping
in the same physical page. Example code below. One fix is to
do the copy with descending addresses if src<dest or whatever.
But then how to return the number of bytes??
Also, I see all these noises from x86 gurus about how copy_*_user()
could be sped up heaps with fancy CPU-specific features. Could
someone actually alight from butt and code that up?
akpm-1:/usr/src/ext3/tools> ./copy-user-test foo
aabcddfghhjkllnopprsttvwxx
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>

int main(int argc, char *argv[])
{
	int fd;
	char *filename;
	char *mapped_mem;
	char buf[26];
	int i;

	if (argc != 2) {
		fprintf(stderr, "Usage: %s filename\n", argv[0]);
		exit(1);
	}

	filename = argv[1];
	fd = open(filename, O_RDWR|O_TRUNC|O_CREAT, 0666);
	if (fd < 0) {
		fprintf(stderr, "%s: Cannot open `%s': %s\n",
			argv[0], filename, strerror(errno));
		exit(1);
	}

	for (i = 0; i < 26; i++)
		buf[i] = 'a' + i;

	/* Map the (still empty) file; the first write() below fills it. */
	mapped_mem = mmap(0, 26, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (mapped_mem == MAP_FAILED) {	/* mmap() fails with MAP_FAILED, not 0 */
		perror("mmap");
		exit(1);
	}

	write(fd, buf, 26);		/* file now holds a..z */
	lseek(fd, 1, SEEK_SET);
	/* Source is the file's own mapping and destination is the same
	 * page, one byte further on: an overlapping copy. */
	write(fd, mapped_mem, 25);
	msync(mapped_mem, 26, MS_SYNC);
	munmap(mapped_mem, 26);
	close(fd);

	{
		char *p = malloc(strlen(filename) + 20);

		sprintf(p, "cat %s ; echo", filename);
		system(p);
	}
	exit(0);
}
Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!news.net.uni-c.dk!uninett.no!uio.no!nntp.uio.no!ifi.uio.no!internet-mailinglist
Newsgroups: fa.linux.kernel
Return-Path: <linux-kernel-owner+fa.linux.kernel=40ifi.uio...@vger.kernel.org>
X-Authentication-Warning: penguin.transmeta.com: torvalds owned process doing -bs
Original-Date: Tue, 21 May 2002 15:04:22 -0700 (PDT)
From: Linus Torvalds <torva...@transmeta.com>
To: Andrew Morton <a...@zip.com.au>
cc: Alan Cox <a...@lxorguk.ukuu.org.uk>, Pavel Machek <pa...@suse.cz>,
Rusty Russell <ru...@rustcorp.com.au>, <linux-ker...@vger.kernel.org>
Subject: Re: AUDIT: copy_from_user is a deathtrap.
In-Reply-To: <3CEAC020.4F63A181@zip.com.au>
Original-Message-ID: <Pine.LNX.4.33.0205211500530.1307-100000@penguin.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-ow...@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Organization: Internet mailing list
Date: Tue, 21 May 2002 22:05:33 GMT
Message-ID: <fa.ob9vcev.jk8539@ifi.uio.no>
References: <fa.dff5jlv.30iuau@ifi.uio.no>
Lines: 45
On Tue, 21 May 2002, Andrew Morton wrote:
>
> Pavel makes a reasonable point that copy_*_user may elect
> to copy the data in something other than strictly ascending
> user virtual addresses. In which case it's not possible to return
> a sane "how much was copied" number.
I don't agree that that is true.
Do you have _any_ reasonable implementation that would do that?
> And copy_*_user is buggy at present: it doesn't correctly handle
> the case where the source and destination of the copy are overlapping
> in the same physical page. Example code below.
So we have memcpy() semantics for read()/write(), big deal.
The same way you aren't supposed to use memcpy() for overlapping areas,
you're not supposed to read/write into such areas, for all the same
reasons.
> One fix is to
> do the copy with descending addresses if src<dest or whatever.
No. That wouldn't work anyway, because the addresses are totally different
kinds.
> But then how to return the number of bytes??
The way we do now, which is the CORRECT way.
Stop this idiocy.
The current interface is quite well-defined, and has good semantics. Every
single argument against it has been totally bogus, with no redeeming
values.
Linus