Discussion:
Question regarding crunchgen(1) binaries
Shawn Webb
2024-04-14 23:43:15 UTC
Hey FreeBSD Hackers,

Note: I originally posted this to the HardenedBSD users mailing list.
I'm posting to freebsd-hackers@ to hopefully learn from a wider
audience.

I wanted to ping the HardenedBSD community, asking about the
usefulness of crunchgen(1)-built applications in 2024.
The main reason to crunch programs together is for fitting as many
programs as possible onto an installation or system recovery floppy.
The binaries in /rescue are built with crunchgen. It seems that
crunchgen-built applications are not (currently) compatible with a
libc built with LTO due to the recent CSU and libc changes.

The binaries in /rescue on HardenedBSD 15-CURRENT/amd64 are 17MB in
size. That application size alone makes it impossible to
build a "system recovery floppy". Additionally, floppy drives aren't
all too common on the amd64, arm64, and riscv64 systems HardenedBSD
targets.

Control Flow Integrity (CFI) is a compiler-based exploit mitigation
that we apply to applications in HardenedBSD 15-CURRENT and 14-STABLE.
In order to apply CFI to applications, application code must be built
with Link Time Optimization (LTO).

Over the past few years, I've slowly been working on applying CFI to
shared objects (aka, Cross-DSO CFI). This requires building library
code with LTO as well.

It seems that with the recent changes to the CSU and libc, the
crunchgen(1) tool does not produce workable applications when
libc is built with LTO. With libc having such a huge surface area, it
would be prudent to apply Cross-DSO CFI to it.

This presents a few possible paths forward:

1. Enhance crunchgen(1) to support libc built with LTO.
2. Kick crunchgen(1) to the curb.
3. Other ideas from the community are possible.

Does anyone find crunchgen(1) to be truly useful in 2024? If we kick
crunchgen(1) to the curb, we need to modify the build system for
/rescue binaries.

My own preference would indeed be to rid ourselves of crunchgen(1) so
that we can progress towards applying Cross-DSO CFI and LTO to libc.

Thanks,
--
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

Tor-ified Signal: +1 303-901-1600 / shawn_webb_opsec.50
https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc
Warner Losh
2024-04-15 00:06:16 UTC
Post by Shawn Webb
Hey FreeBSD Hackers,
Note: I originally posted this to the HardenedBSD users mailing list.
I'm posting to freebsd-hackers@ to hopefully learn from a wider
audience.
I wanted to ping the HardenedBSD community, asking about the
usefulness of crunchgen(1)-built applications in 2024.
For FreeBSD they are quite useful still. The HardenedBSD community can make
up its own mind.
Post by Shawn Webb
The main reason to crunch programs together is for fitting as many
programs as possible onto an installation or system recovery floppy.
Floppy is antiquated. /rescue hasn't fit on a floppy since FreeBSD 3 or so.
Yet, it's still useful to have a tiny ramdisk that's either recovery or
simple servers only.

Post by Shawn Webb
The binaries in /rescue are built with crunchgen. It seems that
crunchgen-built applications are not (currently) compatible with a
libc built with LTO due to the recent CSU and libc changes.
PR? "Seems" is awfully vague.

Post by Shawn Webb
The binaries in /rescue on HardenedBSD 15-CURRENT/amd64 are 17MB in
size. That application size alone makes it impossible to
build a "system recovery floppy". Additionally, floppy drives aren't
all too common on the amd64, arm64, and riscv64 systems HardenedBSD
targets.
Don't take floppy literally here.

Post by Shawn Webb
Control Flow Integrity (CFI) is a compiler-based exploit mitigation
that we apply to applications in HardenedBSD 15-CURRENT and 14-STABLE.
In order to apply CFI to applications, application code must be built
with Link Time Optimization (LTO).
Over the past few years, I've slowly been working on applying CFI to
shared objects (aka, Cross-DSO CFI). This requires building library
code with LTO as well.
It seems that with the recent changes to the CSU and libc, the
crunchgen(1) tool does not produce workable applications when
libc is built with LTO. With libc having such a huge surface area, it
would be prudent to apply Cross-DSO CFI to it.
PR?
Post by Shawn Webb
1. Enhance crunchgen(1) to support libc built with LTO.
2. Kick crunchgen(1) to the curb.
3. Other ideas from the community are possible.
Does anyone find crunchgen(1) to be truly useful in 2024? If we kick
crunchgen(1) to the curb, we need to modify the build system for
/rescue binaries.
My own preference would indeed be to rid ourselves of crunchgen(1) so
that we can progress towards applying Cross-DSO CFI and LTO to libc.
/rescue is still the last line of defence against botched libc updates. We
use it for rare cases where /usr isn't mounted yet and moving binaries is
too hard in rc.

Crunchgen'd binaries make the cost trivial. Plus, I have several appliances
that are a RAM disk with one crunchgen binary. It makes deployment easy,
but I know

I'm strongly opposed to kicking crunchgen to the curb. It's still quite
useful and the reasons proffered are vague and seem little more than LTO
bugs...

So I'm for (3) fixing LTO to not suck. The attack mitigation is a good
feature, but it's not worth killing rescue over...

But my comments are explicitly in a FreeBSD context.

Warner
Jamie Landeg-Jones
2024-04-15 01:05:31 UTC
Post by Shawn Webb
1. Enhance crunchgen(1) to support libc built with LTO.
2. Kick crunchgen(1) to the curb.
3. Other ideas from the community are possible.
Does anyone find crunchgen(1) to be truly useful in 2024? If we kick
crunchgen(1) to the curb, we need to modify the build system for
/rescue binaries.
Please note, my response is not considering the security aspects you raise,
and is only based on the usefulness of /rescue itself.

Do you mean get rid of /rescue, or just getting rid of crunchgen producing
it?

I've been "rescued" by rescue on more than one occasion - usually systems
that won't mount /usr and also have a screwed up lib.

I wouldn't want to see a static /rescue disappear, and the size would probably
be too large for individual binaries.

Cheers, Jamie


--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Shawn Webb
2024-04-15 14:06:17 UTC
Post by Jamie Landeg-Jones
Post by Shawn Webb
1. Enhance crunchgen(1) to support libc built with LTO.
2. Kick crunchgen(1) to the curb.
3. Other ideas from the community are possible.
Does anyone find crunchgen(1) to be truly useful in 2024? If we kick
crunchgen(1) to the curb, we need to modify the build system for
/rescue binaries.
Please note, my response is not considering the security aspects you raise,
and is only based on the usefulness of /rescue itself.
Do you mean get rid of /rescue, or just getting rid of crunchgen producing
it?
I recognize now that the way I phrased things left room for ambiguity.
I apologize for the ambiguity.

We do indeed want to keep /rescue around. I still have the occasional
use for it, as do many others.

The only thing that would change would be that the applications in
/rescue would be regular statically-linked executables. We would
stop using crunchgen(1) to produce those executables.
Post by Jamie Landeg-Jones
I've been "rescued" by rescue on more than one occasion - usually systems
that won't mount /usr and also have a screwed up lib.
I wouldn't want to see a static /rescue disappear, and the size would probably
be too large for individual binaries.
There are around 148 files in my 15-CURRENT/amd64 /rescue. The size
would likely balloon quite drastically.

I think I will likely determine the level of effort to fix
crunchgen(1) to work with LTO-ified libc. I might base my decision off
that.

Meanwhile, if anyone else has any info to pass along that could help
in this journey, I would very much appreciate it. This touches bits
that have a lot of history, and this is definitely a blind spot of
mine.

Thanks,
--
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

Tor-ified Signal: +1 303-901-1600 / shawn_webb_opsec.50
https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc
Warner Losh
2024-04-15 15:45:59 UTC
Post by Shawn Webb
Post by Jamie Landeg-Jones
Post by Shawn Webb
1. Enhance crunchgen(1) to support libc built with LTO.
2. Kick crunchgen(1) to the curb.
3. Other ideas from the community are possible.
Does anyone find crunchgen(1) to be truly useful in 2024? If we kick
crunchgen(1) to the curb, we need to modify the build system for
/rescue binaries.
Please note, my response is not considering the security aspects you raise,
and is only based on the usefulness of /rescue itself.
Do you mean get rid of /rescue, or just getting rid of crunchgen producing
it?
I recognize now that the way I phrased things left room for ambiguity.
I apologize for the ambiguity.
We do indeed want to keep /rescue around. I still have the occasional
use for it, as do many others.
The only thing that would change would be that the applications in
/rescue would be regular statically-linked executables. We would
stop using crunchgen(1) to produce those executables.
I'm going to say what others have said privately: this is a non-starter and
has no support at all. We are not going to stop using crunchgen unless
there is a viable alternative to do the same thing.
Post by Shawn Webb
Post by Jamie Landeg-Jones
I've been "rescued" by rescue on more than one occasion - usually systems
that won't mount /usr and also have a screwed up lib.
I wouldn't want to see a static /rescue disappear, and the size would probably
be too large for individual binaries.
There are around 148 files in my 15-CURRENT/amd64 /rescue. The size
would likely balloon quite drastically.
I think I will likely determine the level of effort to fix
crunchgen(1) to work with LTO-ified libc. I might base my decision off
that.
Meanwhile, if anyone else has any info to pass along that could help
in this journey, I would very much appreciate it. This touches bits
that have a lot of history, and this is definitely a blind spot of
mine.
So far all you've said is they appear not to work. Sounds like an LTO bug
in llvm.

My advice: start with a crunchgen'd cat or hello world and see if you can
at least produce a test case that's small and manageable. You can submit
that upstream to see if they can fix it. Or you can chase down in gdb where
this goes off the rails.

At a wild guess, though, you are talking about adding a security package
that makes things safe somehow. That's typically with symbol redirection.
Maybe start there to understand what "LTO" the security thing is doing and
why it's either wrong or violates an assumption in crunchgen that can be
fixed.

Warner
Poul-Henning Kamp
2024-04-15 19:55:22 UTC
Post by Warner Losh
Maybe start there to understand what "LTO" the security thing is doing and
why it's either wrong or violates an assumption in crunchgen that can be
fixed.
Crunch binaries were invented 30 years ago, to make the FreeBSD
installation program fit on a single floppy disk.

Note that the goal was saving disk-space rather than RAM.

The "architecture" of crunchgen is to take a lot of programs, rename
their main() and link them all together with a new main() which
dispatches to the right program's main() based on argv[0].
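
A minimal sketch of such a dispatching entry point, written for illustration and not taken from the crunchgen sources. The names crunch_table, crunch_dispatch, cat_main and cp_main are hypothetical, and the two stub programs only return a marker value so the lookup can be observed:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the renamed main() of each crunched program. */
static int cat_main(int argc, char **argv) { (void)argc; (void)argv; return 10; }
static int cp_main(int argc, char **argv)  { (void)argc; (void)argv; return 20; }

/* One table entry per program linked into the combined binary. */
struct crunch_entry {
    const char *name;
    int (*entry)(int, char **);
};

static const struct crunch_entry crunch_table[] = {
    { "cat", cat_main },
    { "cp",  cp_main  },
};

/* Return the last path component, so "/rescue/cp" is looked up as "cp". */
static const char *crunch_basename(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash != NULL ? slash + 1 : path;
}

/* Pick the program to run from the name the binary was invoked under. */
int crunch_dispatch(int argc, char **argv)
{
    assert(argv != NULL);
    if (argc < 1 || argv[0] == NULL)
        return 127;

    const char *name = crunch_basename(argv[0]);
    for (size_t i = 0; i < sizeof(crunch_table) / sizeof(crunch_table[0]); i++) {
        if (strcmp(name, crunch_table[i].name) == 0)
            return crunch_table[i].entry(argc, argv);
    }
    fprintf(stderr, "%s: not compiled into this crunched binary\n", name);
    return 127;
}
```

In this sketch the real main() of the combined binary would simply return crunch_dispatch(argc, argv), and each program name would be a hard link to the same file.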

Statistically you save half a disk-allocation unit for each program
which was nothing to sneeze at, but the real disk-space dividend
comes from linking the resulting combi-program static.

Because it is linked static, only those .o files which are referenced
get pulled in from the libraries; libm::j0.o only gets pulled in
if you use Bessel functions, which, contrary to rumours, sysinstall
did not.

(The goal of shared libraries is saving RAM: Everybody gets the
complete library, but only one copy of its code ever gets loaded.)

But the real trick is actually not crunchgen, which was originally just
a shell script, but rather crunchide(1).

Crunchide(1) does unnatural acts to an object file's symbol table,
to get around the fact that all the programs have a function called
"main" and that they litter the global symbol namespace with their
private inter-file references.

To make a crunched binary, the .o files for the individual programs
are first "pre-linked" without libraries so that internal interfile
references are resolved.

Then crunchide changes all global symbols, except "main", to be local
symbols, so that they become unavailable for symbol resolution in
the final run of the linker. The "main" symbol is also renamed
to a per-program name, something like "cp_main" for cp(1) etc.

And then all the prelinked .o files, one per program, get linked
together with the "dispatch main" and this time with libraries.
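
The hide-and-rename step described above can be modelled on a toy symbol table. This is only a sketch: struct sym, enum sym_binding and crunch_hide are hypothetical names, the table is an in-memory array and not a real ELF .symtab, and leaving undefined references global (so the final link can still resolve them against the libraries) is an assumption of the model. A real tool also has to reorder the table, since ELF requires local symbols to come before global ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy model of a symbol table entry; not the real ELF layout. */
enum sym_binding { SYM_LOCAL, SYM_GLOBAL };

struct sym {
    char name[64];
    enum sym_binding binding;
    int defined;   /* nonzero if defined in this object, 0 for an undefined reference */
};

/*
 * For one pre-linked program object:
 *  - rename the global "main" to "<prog>_main" and keep it global,
 *  - turn every other defined global into a local symbol,
 *  - leave undefined references alone so libraries can satisfy them later.
 */
void crunch_hide(struct sym *syms, size_t nsyms, const char *prog)
{
    for (size_t i = 0; i < nsyms; i++) {
        struct sym *s = &syms[i];

        if (s->binding != SYM_GLOBAL || !s->defined)
            continue;

        if (strcmp(s->name, "main") == 0)
            snprintf(s->name, sizeof(s->name), "%s_main", prog);
        else
            s->binding = SYM_LOCAL;
    }
}
```

After this pass, two programs that both define a global helper with the same name no longer collide in the final link, because neither definition is visible outside its own object.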

I see no reason why crunchgen cannot be done with Link Time
Optimization, but somebody has to write the new crunchide(1), and
I suspect it will have a tougher row to hoe, because pre-linking
cannot be used to take care of the inter-program symbols.

As I understand it LTO can also link with "normal libraries"
so one option might be to only LTO the final linking step of
the crunch process, treating all the programs as "normal libraries",
but still getting LTO advantage internally in the libraries.

Poul-Henning
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
***@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Julian H. Stacey
2024-04-17 11:53:25 UTC
Interesting, Nice if some of that were added to man crunchide.

Cheers,
--
Julian Stacey. Gmail & Googlemail Fail http://berklix.org/jhs/mail/#bad
Brits abroad reclaim http://StolenVotes.UK http://www.gov.uk/register-to-vote
Arm Ukraine defence. Contraception reduces global warming & resource wars.
Bakul Shah
2024-05-25 21:44:31 UTC
Post by Poul-Henning Kamp
I see no reason why crunchgen cannot be done with Link Time
Optimization, but somebody has to write the new crunchide(1), and
I suspect it will have a tougher row to hoe, because pre-linking
cannot be used to take care of the inter-program symbols.
As I understand it LTO can also link with "normal libraries"
so one option might be to only LTO the final linking step of
the crunch process, treating all the programs as "normal libraries",
but still getting LTO advantage internally in the libraries.
I'd asked Jaime Da Silva (the original author of crunchgen) about
this. He eventually checked his spam-choked personal domain mbox
and saw my message. He had this to say:

I haven't touched crunch in ~30 years. No doubt "crunchide" is
the problem, zapping symbols needed by CFI and LTO. Assuming
these advanced techniques can work with multiple link passes
("ld -r") then it should be possible to modify crunchide to
rename symbols rather than zapping them.

I am a little surprised crunch is still in use in freebsd.
I think the concept, if it were more flexible, would still
have traction in embedded systems, but everyone seems to be
fine with just using busybox and calling it done.

In case this is useful!