OCamlXSim 3.1 for Mountain Lion
For those interested in building iOS Simulator apps in OCaml 4, I’ve just revamped OCamlXSim 3.1 for the latest OS X release, OS X 10.8 (Mountain Lion). The only difference is in the default iOS SDK, which I changed from iOS 5.1 to iOS 6.0. Otherwise, this was just a recompile.
You can get binary releases of OCamlXSim here:
For information on how to build from sources and how to test an installation, see the updated version of Compile OCaml for iOS Simulator.
If you’re new to this site, you might also be interested in OCamlXARM, a modified version of OCaml 4.00.0 that builds iOS apps. I also revamped it recently to work under Mountain Lion. You can read about it on Compile OCaml for iOS.
OCaml Cross Compilation Build Howto
OCamlXSim and OCamlXARM are both cross compilers, and they’re built using exactly the same approach. I think the strategy could be useful for building other OCaml cross compilers, so I thought I’d explain how the build process works in some detail. I’m not claiming that the method is original; however, I did develop it independently and it works for my host and targets.
Since the stock version of OCaml doesn’t want to be a cross compiler, the overall goal is to beguile it into being one without disrupting the build process too much. To keep things simple for now, I build a bytecode cross compiler that generates native code for the target; i.e., a cross-compiling version of ocamlopt. The approach requires that OCaml already supports the host system with at least a bytecode implementation, and the target system with a native code implementation.
Building the equivalent “optimized” cross compiler (ocamlopt.opt) doesn’t seem too much harder, given a native OCaml compiler for the host system. I’d like to get this working at some point.
Compiler Source Changes
This note just describes the commands I use to build the cross compilers. It doesn’t describe the changes to the compiler source itself. These will vary a lot depending on the target and the differences between the host and the target.
There are no source changes for OCamlXSim when building a 32-bit OS X host executable, because the host and target have virtually identical properties. Even for a 64-bit OS X executable, the changes are minimal, because the host and target are quite similar. There is one change in asmrun/signals_osdep.h, which must be modified to include the proper signal handling code in a cross-compiling environment (when the host and the target architectures are different). Another change in the code generator makes sure that emitted native int values don’t exceed 32 bits.
The compiler source changes for OCamlXARM are much more extensive, because the iOS target isn’t directly supported in the stock OCaml release. The same signal-handling change was required, and many (reasonably straightforward) changes were required in the emission of assembly code to allow for the particular syntax of the iOS assembler.
In cases where the host and target machines are very different, it may be necessary to make significant changes to the architecture-dependent code that emits instructions and data.
If you’re interested in the exact compiler changes for OCamlXSim or OCamlXARM, see their associated pages (linked above) for a description of how to retrieve the patches.
Ordinary OCaml Build
As a starting point for the build process, consider the ordinary OCaml build process:
$ ./configure
$ make world
$ make opt
The configure step does many things:
Guess the CPU type and operating system of the host.
Find a C compiler and associated assembler and linker.
Determine properties of the machine (integer sizes, endianness).
Determine properties of the system (available system calls and libraries).
Since OCaml sees itself as a native compiler, all these configuration properties are assumed to apply both to the compiler itself and to the programs it generates. This isn’t the case for a cross compiler, and the key undertaking is to separate the two.
The make world step builds the bytecode compiler (ocamlc) and bytecode runtime. The bytecode runtime consists of a native-code program named ocamlrun and a set of dynamically loadable executables for extra libraries. ocamlrun, in turn, consists of a bytecode interpreter and native-code primitives. Each dynamic library contains bytecode plus extra native-code primitives.
The make opt step builds the native code compiler (ocamlopt) and a native runtime. The native runtime consists of a set of native libraries, very similar to the bytecode runtime minus the interpreter.
When you do an ordinary compile of an OCaml program with ocamlopt, ocamlopt itself uses the bytecode runtime created in the make world step. The compiled program links against the native runtime created in the make opt step.
Cross Compiling Requirements
To get a cross compiler using the same build system requires a reconsideration of the configuration properties:
The CPU type is used to select the correct native code generator. So the CPU type of the host isn’t so interesting. We want to specify the CPU type of the target.
The C compiler and linker are needed for building the bytecode runtime for the host. However, we also want a target toolchain C compiler, assembler, and linker to be used for generated programs.
Similarly, the machine and system properties are correct for building the bytecode runtime on the host. But we want the target machine and system properties for building the runtime to be used by generated programs.
This suggests a two-phase build process:
Phase 1: run configure as usual to determine the properties of the host system. Post-modify the configuration properties just enough to create a native-code cross compiler for the target. Then build the native-code compiler as usual. This native-code compiler runs on the bytecode interpreter (ocamlrun) of the host, and generates native code for the target.
Phase 2: run configure on the target system to determine the properties of the target system. Then rebuild just the runtime on the host using the target toolchain and these properties of the target system. The resulting runtime works for the compiled programs.
If the target system is insufficiently Unix-like to run the configure script, it will be necessary to determine the configuration parameters by some other method.
This is how both OCamlXARM and OCamlXSim are built. For people really interested in the details, the following sections show the build process for OCamlXSim 3.1.7. You’ll find the code in an OS X shell script named xsim-build.
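As a rough sketch of how the two phases fit together, the four steps can be chained so that a failure in any one stops the build. The step names config1, build1, config2, and build2 are the ones used in the sections that follow; the xsim_all wrapper and the placeholder bodies are my own illustration, not necessarily how xsim-build is organized.

```shell
#!/bin/sh
# Placeholder bodies: in a real script each function would hold the
# commands for that step. Here they only announce themselves.
config1 () { echo "phase 1: configure for host, retarget code generator"; }
build1  () { echo "phase 1: make world && make opt"; }
config2 () { echo "phase 2: reconfigure for target"; }
build2  () { echo "phase 2: rebuild runtime with target toolchain"; }

# Hypothetical driver: run the steps in order, stopping at the first
# one that returns a nonzero status.
xsim_all () {
    config1 && build1 && config2 && build2
}

xsim_all
```

Chaining with && matters here because Phase 2 overwrites parts of the tree that Phase 1 built, so it shouldn't run after a failed Phase 1.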
Phase 1
The configuration step of Phase 1 looks essentially like this:
export PLT=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform
export SDK=/Developer/SDKs/iPhoneSimulator6.0.sdk
config1 () {
    # Configure for building bytecode interpreter to run on Intel OS X.
    # But specify iOSSim parameters for assembly and partial link.
    ./configure \
        -cc "gcc" \
        -as "$PLT/Developer/usr/bin/gcc -arch i386 -c" \
        -aspp "$PLT/Developer/usr/bin/gcc -arch i386 -c"
    # Post-modify config/Makefile to select i386 back end for ocamlopt
    # (i386 assembly code). The "-i .bak" form (BSD sed, as on OS X)
    # edits the file in place and keeps a backup copy.
    sed -i .bak \
        -e 's/^ARCH[ ]*=.*/ARCH=i386/' \
        -e 's/^MODEL[ ]*=.*/MODEL=default/' \
        -e "s#^PARTIALLD[ ]*=.*#PARTIALLD=$PLT/Developer/usr/bin/ld -r#" \
        config/Makefile
    # Post-modify utils/config.ml.
    make utils/config.ml
    sed -i .bak \
        -e 's#let[ ][ ]*mkexe[ ]*=.*#let mkexe ="'"$PLT/Developer/usr/bin/gcc -arch i386 -Wl,-objc_abi_version,2 -Wl,-no_pie -gdwarf-2 -isysroot $PLT$SDK"'"#' \
        -e 's#let[ ][ ]*bytecomp_c_compiler[ ]*=.*#let bytecomp_c_compiler ="'"$PLT/Developer/usr/bin/gcc -arch i386 -gdwarf-2 -isysroot $PLT$SDK"'"#' \
        -e 's#let[ ][ ]*native_c_compiler[ ]*=.*#let native_c_compiler ="'"$PLT/Developer/usr/bin/gcc -arch i386 -gdwarf-2 -isysroot $PLT$SDK"'"#' \
        utils/config.ml
}
The configure step itself specifies the C compiler of the host (gcc), which is needed to build the bytecode runtime. The assembler, however, isn’t needed in this phase. So the configure step can specify the target tools for the two types of assembly; in both cases, it specifies the gcc of the target toolchain. This means that the generated cross compiler will run the proper tools when it assembles its generated native code.
After generating configuration information for the host, the script then post-modifies it to become a cross compiler. Most importantly, it modifies config/Makefile to set its ARCH variable to the target architecture. As mentioned above, this is the key step that attaches the target code generator to the host compiler. The other changes specify a more particular model of CPU (not really used for OCamlXSim) and the target toolchain command for doing partial linking.
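To see what this post-modification does without touching a real OCaml tree, here’s a small sketch that applies the same kind of substitutions to a scratch file. The file contents and the TOOLS path are made up for illustration; only the ARCH and PARTIALLD patterns come from the script above.

```shell
# Scratch file standing in for config/Makefile; the values are made up.
tmp=$(mktemp -d)
cat > "$tmp/Makefile" <<'EOF'
ARCH=amd64
MODEL=default
PARTIALLD=ld -r
SYSTEM=macosx
EOF

# Hypothetical toolchain location, for illustration only.
TOOLS=/path/to/target/bin

# The second expression is double-quoted so $TOOLS expands, and uses
# '#' as the sed delimiter because the path contains slashes.
sed \
    -e 's/^ARCH[ ]*=.*/ARCH=i386/' \
    -e "s#^PARTIALLD[ ]*=.*#PARTIALLD=$TOOLS/ld -r#" \
    "$tmp/Makefile" > "$tmp/Makefile.new"

# Show which lines changed (diff exits nonzero when files differ).
diff "$tmp/Makefile" "$tmp/Makefile.new" || true
```

Lines that don’t match either pattern, such as SYSTEM here, pass through unchanged, which is why the rest of the host configuration survives.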
Note that for OCamlXSim, the target architecture is i386. The iOS Simulator is a 32-bit Intel hardware environment with libraries that recreate the software environment of iOS devices. In the build script for OCamlXARM, the target architecture is armv7.
This leaves the question of how the cross compiler should compile any C programs that are given on its command line, and how it should link the results into an OCaml executable. These commands are inserted at an even deeper level, to avoid interfering with the compilation and linking of the cross compiler runtime. The second set of modifications works by generating utils/config.ml and modifying its commands to be those of the target toolchain.
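The quoting in those sed expressions is a little dense, so here’s a reduced sketch against a scratch file. The sample config.ml lines and the XCC value are assumptions for illustration; the real utils/config.ml has many more definitions.

```shell
# Scratch file standing in for utils/config.ml; the values are made up.
tmp=$(mktemp -d)
cat > "$tmp/config.ml" <<'EOF'
let native_c_compiler = "gcc -O"
let mkexe = "gcc"
EOF

# Hypothetical target compiler command, for illustration only.
XCC="/path/to/target/bin/gcc -arch i386"

# Each expression is three adjacent shell strings glued together:
#   '...="'   single-quoted, ending with a literal double quote
#   "$XCC"    double-quoted, so the variable is expanded
#   '"#'      single-quoted: closing double quote, then the sed delimiter
sed \
    -e 's#let[ ][ ]*native_c_compiler[ ]*=.*#let native_c_compiler ="'"$XCC"'"#' \
    -e 's#let[ ][ ]*mkexe[ ]*=.*#let mkexe ="'"$XCC"'"#' \
    "$tmp/config.ml" > "$tmp/config.ml.new"

cat "$tmp/config.ml.new"
```

The result is still a valid OCaml string definition, because the double quotes that delimit the OCaml string come from the single-quoted pieces and the command itself comes from the expanded variable.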
The build step of Phase 1 looks like this:
build1 () {
    # Don't assemble asmrun/i386.S for Phase 1 build. Modify
    # asmrun/Makefile temporarily to disable. Be really sure to put
    # back for Phase 2.
    trap 'mv -f asmrun/Makefile.aside asmrun/Makefile' EXIT
    grep -q '^[ ]*ASMOBJS[ ]*=' asmrun/Makefile && \
        mv -f asmrun/Makefile asmrun/Makefile.aside
    sed -e '/^[ ]*ASMOBJS[ ]*=/s/^/#/' \
        asmrun/Makefile.aside > asmrun/Makefile
    make world && make opt
    mv -f asmrun/Makefile.aside asmrun/Makefile
    trap - EXIT
    # Save the Phase 1 shared (dynamically loadable) libraries and
    # restore them after Phase 2. They're required by some OCaml
    # utilities, such as camlp4.
    #
    find . -name '*.so' -exec mv {} {}phase1 \;
}
This step basically just runs make world and make opt as usual. However, it turns out to be necessary to make some tricky changes before and after.
First, the assembled output of asmrun/i386.S won’t be compatible with the rest of the bytecode runtime. So we remove it from the build rule of asmrun/Makefile, and restore it later. This works because this file is needed only for native executables, and we’re producing only bytecode executables at this point.
Second, the dynamically loadable libraries of the bytecode runtime will be overwritten during Phase 2. These libraries are needed by the bytecode executables. So we move them aside temporarily, and restore them at the end of Phase 2.
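The set-aside-and-restore handling of asmrun/Makefile can be tried in isolation. This is a minimal sketch on a scratch file with made-up contents; the grep in the middle is only a stand-in for the real make world and make opt.

```shell
# Scratch file standing in for asmrun/Makefile; the contents are made up.
tmp=$(mktemp -d)
mk="$tmp/Makefile"
printf 'ASMOBJS=i386.o\nOTHER=x\n' > "$mk"

# If anything fails before the explicit restore, the trap puts the
# original file back when the shell exits.
trap 'mv -f "$mk.aside" "$mk"' EXIT
mv -f "$mk" "$mk.aside"
sed -e '/^[ ]*ASMOBJS[ ]*=/s/^/#/' "$mk.aside" > "$mk"

# Stand-in for the build: at this point the rule is commented out.
grep '^#ASMOBJS' "$mk"

# Normal path: restore explicitly, then cancel the trap.
mv -f "$mk.aside" "$mk"
trap - EXIT
grep '^ASMOBJS' "$mk"
```

The trap is the "be really sure" part: an interrupted Phase 1 would otherwise leave a modified Makefile behind, and Phase 2 would silently skip the assembly file it needs.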
Phase 2
For Phase 2, we’d like to run configure on our target system. This can be tricky in general, but for OCamlXSim it’s relatively easy. The iOS Simulator actually runs as a separate software environment on OS X, our host system. It’s possible to generate and run code in this environment by specifying the proper command-line options.
If you aren’t so lucky, the requirement is to generate three files: config/s.h, config/m.h, and config/Makefile. A possible plan is to generate these by running configure on a Unix-like system that’s as similar as possible to your target, then make any other modifications by hand.
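If you do produce these files by hand, it may be worth checking that all three are in place before starting the Phase 2 build. The following is only a sketch; check_target_config is a hypothetical helper of my own, not part of xsim-build.

```shell
# Hypothetical helper: report any of the three configuration files
# that is missing under the given source directory.
check_target_config () {
    dir=$1
    missing=0
    for f in config/s.h config/m.h config/Makefile; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return $missing
}

# Usage against a scratch tree that deliberately lacks config/m.h.
tmp=$(mktemp -d)
mkdir -p "$tmp/config"
: > "$tmp/config/s.h"
: > "$tmp/config/Makefile"
check_target_config "$tmp" || echo "target configuration incomplete"
```

This checks only that the files exist, not that their contents describe the target correctly.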
The configuration step of Phase 2 looks essentially like this:
config2 () {
    # Clean out OS X runtime
    cd asmrun; make clean; cd ..
    cd stdlib; make clean; cd ..
    cd otherlibs/bigarray; make clean; cd ../..
    cd otherlibs/dynlink; make clean; cd ../..
    cd otherlibs/num; make clean; cd ../..
    cd otherlibs/str; make clean; cd ../..
    cd otherlibs/systhreads; make clean; cd ../..
    cd otherlibs/threads; make clean; cd ../..
    cd otherlibs/unix; make clean; cd ../..
    # Reconfigure for iOSSim environment
    ./configure \
        -host i386-apple-darwin10.0.0d3 \
        -cc "$PLT/Developer/usr/bin/gcc -arch i386 -gdwarf-2 -isysroot $PLT$SDK" \
        -as "$PLT/Developer/usr/bin/gcc -arch i386 -c" \
        -aspp "$PLT/Developer/usr/bin/gcc -arch i386 -c"
    # Rebuild ocamlmklib, so libraries work with iOSSim.
    rm myocamlbuild_config.ml
    cd tools
    make ocamlmklib
    cd ..
}
The purpose of Phase 2 is to build a runtime for the target. So we start by clearing out the old runtime for the host. Now that we’ve built the cross compiler, it won’t be needed.
Next, we rerun configure, specifying the C compiler and assembler of the target toolchain (in our case, the iOS Simulator). We also specify a specific -host, so that configure doesn’t attempt to guess the CPU and operating system.
Then we rebuild ocamlmklib so it works with the target toolchain rather than the host toolchain.
The build step of Phase 2 looks like this:
build2 () {
    # Make iOSSim runtime
    cd asmrun; make all; cd ..
    cd stdlib; make all allopt; cd ..
    cd otherlibs/unix; make all allopt; cd ../..
    cd otherlibs/str; make all allopt; cd ../..
    cd otherlibs/num; make all allopt; cd ../..
    cd otherlibs/dynlink; make all allopt; cd ../..
    cd otherlibs/bigarray; make all allopt; cd ../..
    cd otherlibs/systhreads; make all allopt; cd ../..
    cd otherlibs/threads; make all allopt; cd ../..
    # Restore the saved Phase 1 .so files (see above).
    find . -name '*.sophase1' -print | \
        while read f; do
            fso="$(expr "$f" : '\(.*\)sophase1$')so"
            mv -f "$f" "$fso"
        done
}
These commands rebuild the runtime using the new toolchain, then restore the dynamically loaded libraries of the host runtime that were saved at the end of Phase 1. These libraries are used by some of the compiling tools; notably, the camlp4 family uses the Unix library.
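Here’s a small sketch of the whole save-and-restore round trip for the .so files, run in a scratch directory. The file name and its contents are made up; the find and expr commands are the same shape as the ones in the script. Note that {} embedded in a larger argument is expanded by both BSD and GNU find, though POSIX doesn’t guarantee it.

```shell
# Scratch tree with one made-up "host" library.
tmp=$(mktemp -d)
mkdir -p "$tmp/otherlibs/unix"
echo host > "$tmp/otherlibs/unix/dllunix.so"

# End of Phase 1: set the host libraries aside as *.sophase1.
find "$tmp" -name '*.so' -exec mv {} {}phase1 \;

# Phase 2 would now build a target library under the original name.
echo target > "$tmp/otherlibs/unix/dllunix.so"

# End of Phase 2: put the host libraries back.
find "$tmp" -name '*.sophase1' -print |
    while read -r f; do
        # expr's ':' operator prints what \(...\) matched, i.e. the
        # path with the trailing "sophase1" removed; "so" is re-appended.
        fso="$(expr "$f" : '\(.*\)sophase1$')so"
        mv -f "$f" "$fso"
    done

cat "$tmp/otherlibs/unix/dllunix.so"
```

After the restore, the file under the original name is the host version again, which is what the bytecode tools need.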
Serendipitously, the resulting executables and objects look just like those of a traditional OCaml release. So they can be installed using the unmodified install rule of the top-level Makefile. It works out this way because there are two distinct parts: the bytecode subsystem (which works on the host), and the native-code subsystem (which works on the target). Things don’t have to be separated this way, but it’s convenient for now.
If you have comments or questions, please leave them below, or email me at jeffsco@psellos.com.
Posted by: Jeffrey