Getting Bap in the Browser 1


In my day job, I spend a lot of time loving and fighting Bap, the Binary Analysis Platform.

For some reason, I have a brain virus that makes it very compelling to compile things to JavaScript. I love being able to share. I figure if I have an interesting idea or doodad, the chance that someone will download and build some script of mine to try it out is almost 0. But if it runs directly in their browser, hey, maybe they will try it and appreciate it.

Thus, a very compelling aspect of OCaml is the existence of the js_of_ocaml compiler. It converts the OCaml bytecode produced by ocamlc into JavaScript. There are often hiccups in these things.

I really want Bap to run in the browser because I think we do cool things at work and I want to show them off. It also bugs me that I am building up all this knowledge that is hard to show to anyone. Bap has a number of useful IRs, data structures, and compiler transformations within it.

Bap is a very complex system. I got much farther than I thought I would at running it in the browser, so I am writing up what I have so far. Here is the GitHub repo with the bits and pieces.

It’s been a weird week. I was supposed to go home, but my flight was canceled due to omicron covid, so I had a pretty desolate week struggling with build systems, reading the Bap source, and debugging JavaScript instead. Still, it ultimately felt productive. I had some big wins.

OCaml has two main compilers, ocamlc and ocamlopt. ocamlc produces bytecode and ocamlopt produces native code. By and large, I am used to building native code, since it runs faster. However, js_of_ocaml works on bytecode.

The OCaml manual for these tools is very nice, by the way. One should always read such things at least once to know what is there.

I realized that using dune has hurt my understanding of the OCaml compilation process, because it handles it for you. This is a problem with all build tools. Somehow I got this far while barely understanding build tools and linkers, just splashing my way through with google-fu and guesswork. One way to start learning is to use --verbose, which shows you the commands dune is running. Even so, the output is huge.

If you have always dealt with package managers and build tools, the raw way of handing libraries to a compiler is a little surprising. The flags for this are quite similar across many compilers. -I adds a directory to the include search path. For each library in OCaml, you also need to pass the library's file name to the compiler. Libraries are actually executed in the order you specify them to the compiler. Consider adding print statements to some OCaml files. The order they print in depends on the order you pass them to the compiler. This may also help explain why the dependencies between your modules must form a DAG, for which you give the compiler a valid topological order.
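To make the "valid topological order" point concrete, here is a minimal sketch that computes a link order from a dependency graph. The module names (main, parser, util) are made up for illustration, and the code assumes the graph really is a DAG; it does no cycle detection.

```ocaml
(* Hypothetical dependency graph: each module is paired with the modules it uses. *)
let deps = [ "main", [ "parser"; "util" ]; "parser", [ "util" ]; "util", [] ]

(* Depth-first topological sort: a module is emitted only after all of its
   dependencies, which is the order the compiler wants on the command line.
   Assumes the graph has no cycles. *)
let link_order deps =
  let visited = Hashtbl.create 16 in
  let order = ref [] in
  let rec visit m =
    if not (Hashtbl.mem visited m) then begin
      Hashtbl.add visited m ();
      List.iter visit (try List.assoc m deps with Not_found -> []);
      order := m :: !order
    end
  in
  List.iter (fun (m, _) -> visit m) deps;
  List.rev !order

let () = print_endline (String.concat " " (link_order deps))
```

With this graph, the result puts util before parser and parser before main, which corresponds to passing util.cmo, parser.cmo, and main.cmo to ocamlc in that order.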

OCaml has several different file types. .cmo files are object files (corresponding to .ml files), .cmi files are compiled interfaces (corresponding to .mli-ish files), and .cma files are archives that bundle object files together. This mirrors C compilers, which have .o, .h, and .a files that play quite similar roles. The native compiler has its own versions of these, .cmx and .cmxa.

To build something using raw js_of_ocaml, run ocamlc on the different pieces to get a .cmo or a .out file. I kept mixing up whether I was using .cmo or .out files. The first is an unlinked object file, the second is a linked executable. You can also tell these apart by running the file command on them.

ocamlfind is a separate package locator for OCaml that comes by default with opam (I think?). You can find the location of an installed package with ocamlfind query z3, for example. You can also use it to wrap the other compiler commands. This is very useful, and it is how you can specify libraries without hunting down all the paths and files yourself to pass to the compiler.

Js_of_ocaml has two different modes, separate compilation and whole-program compilation. Separate compilation is probably good for compile times. It seems that whole-program compilation is still needed if you want to use dynlinking and the toplevel. In dune, whole-program compilation is selected with the --release flag.

If you just want to use Bap data structures like Var.t, you can get by with using dune and including bap as a library. You will need the stanza (flags -linkall). You can compile for js_of_ocaml by including (modes byte js) and running dune build main.bc.js.

A --release build and a dev build for js_of_ocaml in dune behave very differently. I tend to need --release.

Complicated bug

My initial experiment was just to see how far I could get compiling Bap by hand. Here I tried to manually add all the appropriate JavaScript stubs. I eventually switched to using dune.

It stopped at a certain point, complaining about closing a channel that was already closed. It is a bit unfortunate that software engineering tends toward failing gracefully by catching all errors. That significantly slows down the debugging process. A completely ungraceful failure gives a stack trace at the actual problem location.
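As a toy illustration of how a catch-all handler hides the real cause, consider the sketch below. Missing_stub and load_plugin are hypothetical names invented for this example; they are not part of Bap or camlzip.

```ocaml
(* Hypothetical exception standing in for a missing JavaScript primitive. *)
exception Missing_stub of string

(* Hypothetical loader that fails the way a missing stub would. *)
let load_plugin () : unit = raise (Missing_stub "unix_utimes")

(* "Graceful" handling: the wildcard throws away the original exception,
   so the caller only ever sees a generic message. *)
let graceful () =
  try load_plugin (); Ok () with _ -> Error "plugin failed to load"

(* Keeping the exception around preserves the name of the real culprit. *)
let informative () =
  try load_plugin (); Ok () with e -> Error (Printexc.to_string e)

let () =
  match informative () with
  | Ok () -> print_endline "loaded"
  | Error msg -> print_endline msg
```

The first version reports the same thing no matter what went wrong, which is roughly the situation described above.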

In this case, the problem was actually a missing JavaScript stub for the Unix utimes system call. This was completely obscured by the exception handling machinery. The utimes call was actually in the camlzip library, and it took me a bit of digging to find out exactly where.

Add a file helpers.js containing

//Provides: unix_utimes
function unix_utimes(x,y,z,w) { return 0; } // no-op stub; 0 is how js_of_ocaml represents unit

And add the stanza (js_of_ocaml (javascript_files helpers.js)) to the dune file, which I figured out by examining the example of zarith_stubs_js. That package must also be added alongside bap for JavaScript compilation.

Dynamic linking

I was more than a little confused about how dynlinking would work in js_of_ocaml. Bap uses a plugin system based on the Dynlink module.

When you start a program that uses Bap, you are always supposed to call Bap_main.init, and in typical use you really let Bap do it for you. It loads all the plugins that Bap needs to implement the features you asked for. Bap plugins are zip files, which is why camlzip gets called. I am really shocked that this works, because some external functions of camlzip are reported as still missing. The bap_plugin library orchestrates this. It in turn uses stuff from the bap_bundle library. load in bap_plugin is a ref that can be set to either the toplevel dynamic loading interface or the Dynlink interface. For a moment I thought maybe js_of_ocaml only supported toplevel dynamic loading, but now I am not so sure. baptop uses this mechanism, for example:

  let loader = Topdirs.dir_load Format.err_formatter in
  setup_dynamic_loader loader;

I asked how to use Dynlink here. Long story short: use whole-program compilation mode with --toplevel --dynlink +dynlink.js +toplevel.js. In addition, maybe in newer versions of OCaml, you need to add Sys.interactive := false. Here is a simple repo I put together.

open Js_of_ocaml_toplevel
let () = JsooTop.initialize ()

let () = Sys.interactive := false
let () = Printf.printf "%b\n" Dynlink.is_native
let _ = Dynlink.loadfile "plugin.cmo"
let () = print_endline "hello"

Stack overflow and Js_of_ocaml

Unfortunately, JavaScript and OCaml have mismatched expectations about the stack and tail call elimination. Because of this, it is quite possible to get stack overflow errors.

Apparently JavaScript put tail call elimination into the spec quite some time ago, but there was disagreement about it and the browsers never really implemented it.

Scala, as I understand it, has a similar problem, being a functional programming language on the JVM.

I have run into this problem in a number of places. The first was a stack overflow in the js_of_ocaml compiler itself, when it is invoked from JavaScript while Bap is dynamically linking in libraries (bap_c in this case). I made a version of the compiler in which a non-tail-recursive function became one with an explicit accumulator parameter. You can find it here.
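The accumulator rewrite is the standard trick. The sketch below shows it on a toy list sum rather than on the actual js_of_ocaml function I changed; sum and sum_acc are names made up for this example.

```ocaml
(* Not tail recursive: the addition happens after the recursive call
   returns, so every list element costs one stack frame. *)
let rec sum = function
  | [] -> 0
  | x :: rest -> x + sum rest

(* Tail recursive: the running total travels in an explicit accumulator,
   so the recursive call is the last thing the function does. *)
let sum_acc xs =
  let rec go acc = function
    | [] -> acc
    | x :: rest -> go (acc + x) rest
  in
  go 0 xs

let () =
  let ones = List.init 1_000_000 (fun _ -> 1) in
  Printf.printf "%d\n" (sum_acc ones)
```

A self tail call like go can be turned into a loop, so the accumulator version stays flat on the stack, whereas sum on a long enough list is the kind of thing that produces a RangeError once it is running as JavaScript.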

Bap recently converted some of its monads to CPS style. This is probably not good without tail call optimization. Bap 2.3.0 had an older style that did not hit these issues in my simple test. Here I followed the lead of jsCoq and added trampolines to these monads.

If you define a thunk data type, you can return early whenever a computation still needs to be made. What this does is unwind the stack back to the point where you called trampoline, keeping all the necessary data inside closures that live on the heap. So at every point where an 'a thunk computation is expected, you can turn my_complicated_computation into TailCall (fun () -> my_complicated_computation). This defines a point where you move things from the stack to the heap.

type 'a thunk =
    | Fin of 'a
    | TailCall of (unit -> 'a thunk)

let rec trampoline r = match r with
    | Fin a -> a
    | TailCall f -> trampoline (f ())
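Here is a small usage sketch of the thunk type above. The is_even / is_odd pair is a made-up example, not anything from Bap. Each function returns a TailCall instead of calling its partner directly, so the only thing that ever loops is trampoline.

```ocaml
type 'a thunk =
  | Fin of 'a
  | TailCall of (unit -> 'a thunk)

let rec trampoline r = match r with
  | Fin a -> a
  | TailCall f -> trampoline (f ())

(* Mutually recursive functions that bounce instead of calling each other:
   each step returns a closure to the trampoline rather than growing the stack. *)
let rec is_even n =
  if n = 0 then Fin true else TailCall (fun () -> is_odd (n - 1))
and is_odd n =
  if n = 0 then Fin false else TailCall (fun () -> is_even (n - 1))

let () = Printf.printf "%b\n" (trampoline (is_even 1_000_000))
```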

I just added this change to the return case of the monad, which is enough for the moment. See the Bap patch here. That it was enough surprised me. I thought I would have to change bind, but I followed jsCoq.
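To show why bouncing only in return can be enough, here is a minimal sketch of a CPS monad with a trampolined return. It is a toy with a fixed answer type, not Bap's actual monad code; cps, run, and count are names invented for this example.

```ocaml
type 'a thunk =
  | Fin of 'a
  | TailCall of (unit -> 'a thunk)

let rec trampoline r = match r with
  | Fin a -> a
  | TailCall f -> trampoline (f ())

(* A CPS computation: given a continuation, produce a thunk of the answer. *)
type ('a, 'r) cps = ('a -> 'r thunk) -> 'r thunk

(* return does not call the continuation; it hands a closure back to the trampoline. *)
let return x : ('a, 'r) cps = fun k -> TailCall (fun () -> k x)

(* bind is plain CPS plumbing with no trampolining of its own. *)
let bind (m : ('a, 'r) cps) (f : 'a -> ('b, 'r) cps) : ('b, 'r) cps =
  fun k -> m (fun a -> f a k)

let run m = trampoline (m (fun x -> Fin x))

(* A long chain of binds: every step goes through return, so every step bounces. *)
let rec count n acc =
  if n = 0 then return acc
  else bind (return (acc + 1)) (fun acc -> count (n - 1) acc)

let () = Printf.printf "%d\n" (run (count 1_000_000 0))
```

In this toy, every bind eventually runs into a return, and that is where the stack gets released. Whether the same holds for every path through a real monad library is something I would test rather than assume.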

There is a js_of_ocaml flag option, --disable inline, that might help with the stack overflows, but I have never seen it work.

Js_of_ocaml external file system

Bap needs many files to work properly.

Js_of_ocaml has a super cool ability to embed arbitrary files into a pseudo-filesystem. The command line flag --file absolute/path/to/myfile will include this file in the generated JavaScript. --file apparently adds files to a folder named /static/. That folder also has to be added to Bap’s various internal search paths.
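From the OCaml side, an embedded file is read with the ordinary channel functions. The sketch below assumes a file was embedded with --file and shows up as /static/myfile; that exact path is an assumption for illustration, and read_file is just a helper name I made up.

```ocaml
(* Read a whole file using the ordinary Stdlib channel API. *)
let read_file path =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in ic)
    (fun () -> really_input_string ic (in_channel_length ic))

let () =
  (* Assumed location of a file embedded with --file. *)
  let path = "/static/myfile" in
  if Sys.file_exists path then print_string (read_file path)
  else prerr_endline ("not found: " ^ path)
```

Run natively, this will most likely just report that the file is not found, which is a quick way to see which paths a program expects before embedding them.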

This was a nice opportunity to use strace. Once I had something that worked from the command line in node but not in the browser, I could trace all the file opens and try including those files with --file one by one.

strace -e trace=open,openat,close,connect,accept dune exec ./main.bc

strace is quite a useful thing sometimes. It traces system calls, which are the portal from your program to the world.

Js_of_ocaml troubleshooting tips

Be sure to use all the nice command line options. Your life would be impossible without them:
--pretty --debuginfo --source-map-inline. You also have to run ocamlc with debug info enabled, -g.

I tended to use node to make sure things worked in the beginning. node lets you increase the stack size with --stack-size=10000. Browsers do not. node is also probably faster.

Surprisingly, I found it very helpful to edit the generated _build/default/main.bc.js directly by hand to do console.log debugging, because some errors came from libraries where it was very annoying to add OCaml-level debug printing.

The actual error that was printed had often been thrown and caught somewhere else. Grepping for that string in main.bc.js, grepping for RangeError, and adding console.log(exn.stack) in those places was wonderfully helpful.

For stack overflows, it is often helpful to set Error.stackTraceLimit = Infinity; to get the full trace.

TODO: The actual hard parts

Write the llvm disassembler / ghidra / z3 and then connect these to js_of_ocaml.

However, even with just the Primus Lisp loaders, I think I can do a lot of fun things. Who cares about binaries anyway?


