Alan Somers
2024-04-04 18:14:45 UTC
tl;dr: there are two problems:
1) tmpfs handles SEEK_HOLE differently than other file systems
2) everything else handles SEEK_HOLE at EOF poorly, IMHO
Details:
According to lseek(2), SEEK_HOLE should return the start of the next
hole greater than or equal to the supplied offset. Also, each file
has a zero-sized virtual hole at the very end of the file. So I would
expect that calling SEEK_HOLE at EOF would return the file's size.
However, the man page also says that SEEK_HOLE will return ENXIO when
the offset points to EOF. Those two statements seem contradictory to
me. The first behavior seems more logical. I would expect SEEK_HOLE
to work the same way both at EOF and at any other file offset.
What does the spec say?
There is no POSIX standard for this. SEEK_HOLE was invented by
Solaris, and Illumos's man page does not clearly say what should
happen at EOF. Linux's man page is clear. It lists ENXIO for the case
where "whence is SEEK_DATA or SEEK_HOLE, and offset is beyond the end
of the file". That would seem to indicate behavior 1: SEEK_HOLE
should return the file's size at EOF. Only beyond EOF should it
return ENXIO.
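
The two readings can be told apart by calling SEEK_HOLE just before, at, and just past EOF. The following is a minimal sketch, assuming a platform where Python exposes os.SEEK_HOLE (FreeBSD and Linux do). The helper name seek_hole_results and the 4096-byte file size are made up for this illustration, and the result at EOF depends on the file system backing the directory used.

```python
import errno
import os
import tempfile


def seek_hole_results(directory=None, size=4096):
    """Call lseek(SEEK_HOLE) just before, at, and just past EOF.

    Returns a dict mapping each starting offset to either the offset the
    call returned or the symbolic errno name it failed with.
    """
    results = {}
    with tempfile.TemporaryFile(dir=directory) as f:
        f.write(b"x" * size)  # fully written, so no holes before EOF
        f.flush()
        for start in (size - 1, size, size + 1):
            try:
                results[start] = os.lseek(f.fileno(), start, os.SEEK_HOLE)
            except OSError as e:
                results[start] = errno.errorcode.get(e.errno, str(e.errno))
    return results


if __name__ == "__main__":
    for start, outcome in seek_hole_results().items():
        print(f"SEEK_HOLE from {start}: {outcome}")
```

Under the "virtual hole at EOF" reading, the middle call returns the file size. Under the other reading it fails with ENXIO, the same as the call past EOF.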
But what do other implementations do?
Contrary to its man page, Linux behaves mostly like FreeBSD. SEEK_HOLE
returns ENXIO at EOF on most file systems. I tested a number of file
systems on both FreeBSD and Linux. Most of them return ENXIO. The
only two outliers are FreeBSD's tmpfs and Linux's NFS client.
File system  FreeBSD    Linux
===========  =========  =========
UFS          ENXIO
ZFS          ENXIO
tmpfs        file size  ENXIO
msdosfs      ENXIO      ENXIO
ext2fs       ENXIO      ENXIO
xfs                     ENXIO
tarfs        ENXIO
nfs          ENXIO      file size
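
A survey like this could be reproduced with a small script that is pointed at one directory per mounted file system. This is only a sketch, not the test that was actually run: the names seek_hole_at_eof and survey are hypothetical, it assumes Python exposes os.SEEK_HOLE, and it assumes every directory is writable. A read-only file system such as tarfs would need an existing file opened instead.

```python
import errno
import os
import sys
import tempfile


def seek_hole_at_eof(directory, size=4096):
    """Describe what lseek(fd, size, SEEK_HOLE) does on a size-byte file."""
    with tempfile.TemporaryFile(dir=directory) as f:
        f.write(b"x" * size)  # no holes before EOF
        f.flush()
        try:
            offset = os.lseek(f.fileno(), size, os.SEEK_HOLE)
        except OSError as e:
            # Report the symbolic name, e.g. "ENXIO".
            return errno.errorcode.get(e.errno, str(e.errno))
    return "file size" if offset == size else f"offset {offset}"


def survey(directories):
    """Map each directory to the outcome of SEEK_HOLE at EOF there."""
    return {d: seek_hole_at_eof(d) for d in directories}


if __name__ == "__main__":
    # Pass one directory per file system; defaults to the temp directory.
    dirs = sys.argv[1:] or [tempfile.gettempdir()]
    for d, outcome in survey(dirs).items():
        print(f"{d:<24} {outcome}")
```

Running it once on each operating system, with one argument per mount point, gives one column of the table.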
So what should we change? Clearly, it's bad for tmpfs to be
inconsistent. My preference would be for everything to behave like
tmpfs, but it's currently losing the popularity contest. Anybody else
have thoughts?
-Alan
--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de