We recently made an important change when obfuscating the runtime,
so that if it's missing any linkname packages in ListedPackages,
it does an extra "go list" call to obtain their information.
This works very well, but we missed an edge case.
In main.go, we disable flagLiterals for the runtime package,
but not for other packages like sync/atomic.
And, since the runtime's extra "go list" has to compute GarbleActionIDs,
it uses the list of garble flags via appendFlags.
Unfortunately, it thinks "-literals" isn't set, when it is,
and the other packages see it as being set.
This discrepancy results in link time errors,
as each end of the linkname obfuscates with a different hash:
> garble -literals build
[stderr]
# test/main
jccGkbFG.(*yijmzGHo).String: relocation target jccGkbFG.e_77sflf not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.ce66Fmzl not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.e5kPa1qY not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.aQ_3sL3Q not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.zls3wmws not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.g69WgKIS not defined
To fix the problem, treat flagLiterals as read-only after flag.Parse,
just like we already do with the other flags except flagDebugDir.
The code that turned flagLiterals to false is no longer needed,
as literals.Obfuscate is only called when ToObfuscate is true,
and ToObfuscate is false for runtimeAndDeps already.
We went to great lengths to ensure garble builds are reproducible.
This includes how the tool itself works,
as its behavior should be the same given the same inputs.
However, we made one crucial mistake with the runtime package.
It has go:linkname directives pointing at other packages,
and some of those pointed packages aren't its dependencies.
Imagine two scenarios where garble builds the runtime package:
1) We run "garble build runtime". The way we handle linkname directives
calls listPackage on the target package, to obfuscate the target's
import path and object name. However, since we only obtained build
info of runtime and its deps, calls for some linknames such as
listPackage("sync/atomic") will fail. The linkname directive will
leave its target untouched.
2) We run "garble build std". Unlike the first scenario, all listPackage
calls issued by runtime's linkname directives will succeed, so its
linkname directive targets will be obfuscated.
At best, this can result in inconsistent builds, depending on how the
runtime package was built. At worst, the mismatching object names can
result in errors at link time, if the target packages are actually used.
The modified test reproduces the worst case scenario reliably,
when the fix is reverted:
> env GOCACHE=${WORK}/gocache-empty
> garble build -a runtime
> garble build -o=out_rebuild ./stdimporter
[stderr]
# test/main/stdimporter
JZzQivnl.NtQJu0H3: relocation target JZzQivnl.iioHinYT not defined
JZzQivnl.NtQJu0H3.func9: relocation target JZzQivnl.yz5z0NaH not defined
JZzQivnl.(*ypvqhKiQ).String: relocation target JZzQivnl.eVciBQeI not defined
JZzQivnl.(*ypvqhKiQ).PkgPath: relocation target JZzQivnl.eVciBQeI not defined
[...]
The fix consists of two steps. First, if we're building the runtime and
listPackage fails on a package, that means we ran into scenario 1 above.
To avoid the inconsistency, we fill ListedPackages with "go list [...] std".
This means we'll always build runtime as described in scenario 2 above.
Second, when building packages other than the runtime,
we only allow listPackage to succeed if we're listing a dependency of
the current package.
This ensures we won't run into similar reproducibility bugs in the future.
Finally, re-enable test-gotip on CI since this was the last test flake.
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes#276.
In the added test case, "garble -literals build" would fail:
--- FAIL: TestScripts/literals (8.29s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble -literals build
[stderr]
# test/main
Usz1FmFm.go:1: cannot call non-function string (type int), declared at Usz1FmFm.go:1
Usz1FmFm.go:1: string is not a type
Usz1FmFm.go:1: cannot call non-function append (type int), declared at Usz1FmFm.go:1
That is, for input code such as:
var append int
println("foo")
_ = append
We'd end up with obfuscated code like:
var append int
println(func() string {
// obfuscation...
x = append(x, ...)
// obfuscation...
return string(x)
})
_ = append
Which would then break, as the code is shadowing the "append" builtin.
To work around this, always obfuscate variable names, so we end up with:
var mwu1xuNz int
println(func() string {
// obfuscation...
x = append(x, ...)
// obfuscation...
return string(x)
})
_ = mwu1xuNz
This change shouldn't make the quality of our obfuscation stronger,
as local variable names do not currently end up in Go binaries.
However, this does make garble more consistent in treating identifiers,
and it completely avoids any issues related to shadowing builtins.
Moreover, this also paves the way for publishing obfuscated source code,
such as #369.
Fixes#417.
We can now use pruned module graphs in go.mod files,
and we no longer need to worry about runtime/internal/sys.
Note that I had to update testdata/mod slightly,
as the new pruned module graphs algorithm downloads an extra go.mod file.
This change also paves the way towards future Go 1.18 support.
Thanks to lu4p for cleaning up two TODOs as well.
Co-Authored-By: lu4p <lu4p@pm.me>
With the -literals flag, we try to convert some const declarations to
vars, as long as that doesn't break typechecking. We really only do that
for typed constant strings, really.
There was a quirk: if a numerical constant had a type and used iota, we
would not obfuscate its value, but we would still convert the
declaration from const to var. Since iotas only work within const
declarations, that would break compilation:
> garble -literals build
[stderr]
# test/main
FeWE3zwi.go:19: undefined: iota
exit status 2
To fix the problem, make the logic more conservative: only obfuscate
constant declarations where the values are typed strings, meaning that
any numerical constants are left entirely untouched.
This fixes the build of google.golang.org/protobuf/runtime/protoiface
with -literals turned on.
Two bugs were remaining which made the build with -literals of std fail.
First, we were ignoring too many objects in constant expressions,
including type names. This resulted in type names declared in
dependencies which were incorrectly not obfuscated in the current
package:
# go/constant
O1ku7TCe.go:1: undefined: alzLJ5Fd.Word
b0ieEGVQ.go:1: undefined: alzLJ5Fd.Word
LEpgYKdb.go:4: undefined: alzLJ5Fd.Word
FkhHJCfm.go:1: undefined: alzLJ5Fd.Word
This edge case is easy to reproduce, so a test case is added to
literals.txt.
The second issue is trickier; in some packages like os/user, we would
get syntax errors because of comments printed out of place:
../tip/os/user/getgrouplist_unix.go:35:130: syntax error: unexpected newline, expecting comma or )
This is a similar kind of error that we tried to fix with e2f06cce94. In
particular, it's fixed by also setting CallExpr.Rparen in withPos. We
also add many other missing Pos fields for good measure, even though
we're not sure they help just yet.
Unfortunately, all my attempts to minimize this into a reproducible
failure have failed. We can't just copy the failing file from os/user,
as it only builds on some OSs. It seems like it was the perfect mix of
cgo (which adds line directive comments) plus unlucky positioning of
literals.
For that last reason, as well as for ensuring that -literals works well
with a wide variety of software, we add a build of all of std with
-literals when not testing with -short. This is akin to what we do in
goprivate.txt, but with the -literals flag. This does make "go test"
more expensive, but also more thorough.
Fixes#285, hopefully for good this time.
The regular obfuscation process simply modifies some simple nodes, such
as identifiers and strings. In those cases, we modify the nodes
in-place, meaning that their positions remain the same. This hasn't
caused any problems.
Literal obfuscation is trickier. Since we replace one expression with an
entirely different one, we use cursor.Replace. The new expression is
entirely made up on the spot, so it lacks position information.
This was causing problems. For example, in the added test input:
> garble -literals build
[stderr]
# test/main
dgcm4t6w.go:3: misplaced compiler directive
dgcm4t6w.go:4: misplaced compiler directive
dgcm4t6w.go:3: misplaced compiler directive
dgcm4t6w.go:6: misplaced compiler directive
dgcm4t6w.go:7: misplaced compiler directive
dgcm4t6w.go:3: misplaced compiler directive
dgcm4t6w.go:9: misplaced compiler directive
dgcm4t6w.go:3: misplaced compiler directive
dgcm4t6w.go:3: too many errors
The build errors are because we'd move the compiler directives, which
makes the compiler unhappy as they must be directly followed by a
function declaration.
The root cause there seems to be that, since the replacement nodes lack
position information, go/printer would try to estimate its printing
position by adding to the last known position. Since -literals adds
code, this would result in the printer position increasing rapidly, and
potentially printing directive comments earlier than needed.
For now, making the replacement nodes have the same position as the
original node seems to stop go/printer from making this mistake.
It's possible that this workaround won't be bulletproof forever, but it
works well for now, and I don't see a simpler workaround right now.
It would be possible to use fancier mechanisms like go/ast.CommentMap or
dave/dst, but those are a significant amount of added complexity as well.
Fixes#285.
Fix up a few TODOs, and simplify the way we handle comments.
We now add whitespace around inline /*line*/ directives, to ensure we
don't break programs. A test case is added too.
We now add line directives to call sites, not function declarations,
since those are what actually shows up in stack traces.
It's unclear if we care about any other lines inside functions at all.
This also fixes reversing with -literals, since that feature adds a
significant amount of code which shuffles line numbers around.
Finally, we extend the tests with types, methods, and anonymous
functions, and we make all of them work well.
Updates #5.
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.
Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.
To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.
The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.
Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.
isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".
Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.
Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.
While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
This mainly cleans up the few bits of code where we explicitly kept
support for Go 1.15.x. With v0.1.0 released, we can drop support now,
since the next v0.2.0 release will only support Go 1.16.x.
Also updates all modules, including test ones, to 'go 1.16'.
Note that the TOOLEXEC_IMPORTPATH refactor is not done here, despite all
the TODOs about doing so when we drop 1.15 support. This is because that
refactor needs to be done carefully and might have side effects, so it's
best to keep it to a separate commit.
Finally, update the deps.
It's common for asset bundling code generators to produce huge literals,
for example in strings. Our literal obfuscators are meant for relatively
small string-like literals that a human would write, such as URLs, file
paths, and English text.
I ran some quick experiments, and it seems like "garble build -literals"
appears to hang trying to obfuscate literals starting at 5-20KiB. It's
not really hung; it's just doing a lot of busy work obfuscating those
literals. The code it produces is also far from ideal, so it also takes
some time to finally compile.
The generated code also led to crashes. For example, using "garble build
-literals -tiny" on a package containing literals of over a megabyte,
our use of asthelper to remove comments and shuffle line numbers could
run out of stack memory.
This all points in one direction: we never designed "-literals" to deal
with large sizes. Set a source-code-size limit of 2KiB.
We alter the literals.txt test as well, to include a few 128KiB string
literals. Before this fix, "go test" would seemingly hang on that test
for over a minute (I did not wait any longer). With the fix, those large
literals are not obfuscated, so the test ends in its usual 1-3s.
As said in the const comment, I don't believe any of this is a big
problem. Come Go 1.16, most developers should stop using asset-bundling
code generators and use go:embed instead. If we wanted to somehow
obfuscate those, it would be an entirely separate feature.
And, if someone wants to work on obfuscating truly large literals for
any reason, we need good tests and benchmarks to ensure garble does not
consume CPU for minutes or run out of memory.
I also simplified the generate-literals test command. The only argument
that matters to the script is the filename, since it's used later on.
Fixes#178.
In Go 1.15, if a dependency is required but not listed in go.mod/go.sum,
it's resolved and added automatically.
This is changing in 1.16. From that release, one will have to explicitly
update the mod files via 'go mod tidy' or 'go get'.
To get ahead of the curve, start using -mod=readonly to get the same
behavior in 1.15, and fix all existing tests.
The only tests that failed were imports.txt and syntax.txt, the only
ones to require other modules. But since we're here, let's add the 'go'
line to all go.mod files as well.
Use a static main.stderr file, like in the other tests. This means we
don't need to always start the test with a 'go build', and the output is
also obvious by just reading the txtar file.
We can also move generate-literals to a later stage, so that 'go test
-short' needs to do even less work.
'go test -short -run Script/literals' drops from ~0.4s to ~0.2s on my
laptop.
Finally, make the printing of byte lists not use trailing spaces, so
that the txtar file itself doesn't have trailing whitespace in its lines
either.
Fixes#103.
In tiny.txt, we already check line numbers via stderr, so there's no
need to do that via -debugdir.
In syntax.txt, we only really care about what names remain in the
binary, not the names which remain in the source but don't affect the
binary.
These changes are important because -debugdir adds a non-trivial amount
of work, which will impede build caching once that feature lands. We
will likely make -debugdir support build caching eventually, but for
now, this preliminary change will make 'go test' much faster with build
caching.
And of course, the tests get simpler, which is nice.
basic.txt just builds main.go without a module. Similarly, we leave
imports.txt without a GOPRIVATE, to test the 'go list -m' fallback.
For all other tests, explicitly set GOPRIVATE, to avoid two exec calls -
both 'go env GOPRIVATE' as well as 'go list -m'. Each of those calls
takes in the order of 10ms, so saving ~26 exec calls should easily add
to 200-300ms saved from 'go test -short'.
Fixes #2.
Line numbers are now obfuscated, via `//line` comments.
Filenames are now obfuscated via `//line` comments, instead of changing the actual filename.
New flag `-tiny` to reduce the binary size, at the cost of reversibility.
Implement a literal obfuscator interface,
to allow the easy addition of new encodings.
Add literal obfuscation for byte literals.
Choose a random obfuscator on literal obfuscation,
useful when multiple obfuscators are implemented.
Fixes#62