Hacking and upgrading the Korg Kronos

From a 1.66GHz Atom to a 3.9GHz Skylake

The Korg Kronos is an interesting beast. Korg’s flagship synth, it’s marketed as a music workstation, but under the hood it’s actually built on a commodity x86 motherboard and some custom I/O hardware. And it runs Linux.

That’s just asking to be hacked, isn’t it? :-)

I’ve owned one of these for a few years, and at one point did some investigation into the software, but got bored and never really did anything interesting with it (to be fair, it’s already complicated enough with the stock software!). Recently, though, I decided to take a look at it again, and I ended up stumbling upon a year-old blog called kronoshacker.

This anonymous hacker has been documenting all kinds of details about the Kronos software, from security to obfuscation to performance. He upgraded his Kronos by replacing the motherboard and CPU (originally an outdated Atom) with a bleeding-edge Skylake CPU, massively improving his unit’s performance. I decided to try to do the same, and document the process.

Kronos hardware primer

The original Kronos motherboard

The Kronos basically consists of two major components: the x86, and the NKS4. The x86 is a bog-standard Intel D510MO motherboard with a dual-core Atom D510 at 1.66GHz. It runs Linux, and all of the synthesizing and signal processing happens here. However, its I/O is mostly unused; it’s really just the “brain” of the device. Software, samples, and user data are stored on an internal SATA SSD, 30GB stock. The motherboard comes with 2GB of RAM stock, but can be upgraded officially to 3GB. Since the CPU and software are 32-bit only, using more than 4GB of RAM is impossible. The system peripherals reserve some address space, but it is possible to install 4GB of RAM and have the synth recognize a few hundred megabytes of extra memory more than a 3GB unit.

Most of the I/O happens via the NKS4. The NKS4 is a custom board based around a TI AM1806 processor. It handles the touchscreen (framebuffer and input), all of the buttons and LEDs, and audio I/O. It runs a fairly simple firmware, and connects to the x86 via a USB port. The firmware is upgradable, but it is not upgraded internally. Instead, to upgrade it, you hold down ENTER + PAUSE while turning the unit on, which puts it into a bootloader mode. You then connect the USB device port to a PC, and run a utility that Korg provides that will update the firmware (apparently via the USB MIDI interface). This seems kind of silly, since it would make more sense for the NKS4 to be updated internally during the Kronos’ firmware update procedure, but oh well.

The keybed is connected separately to the x86, via an RS232 port, and is controlled by a separate microcontroller rather than by the NKS4. Perhaps they decided to use the RS232 port to shave off some latency, compared to sending the keybed data through the NKS4 and USB.

Software shenanigans

The Kronos runs a standard BIOS, and boots normally, using plain old GRUB and no secure boot. The kernel is a bit special, though: it includes realtime patches (RTAI) and some Korg-custom modifications.

I’ll skip the rant about Korg violating the GPL. Suffice it to say, they do it in quite a few places, from undisclosed kernel patches to a shim module that re-exports GPL-only symbols as non-GPL to GPLed userspace apps with patches and no source. They make a token effort to distribute the kernel source and incomplete patches on a DVD shipped with the unit, but I digress.

The kernel boots into a stripped-down Linux distribution (RedHat/Fedora based by the looks of it, but with very little of that identity left), with only minimal userspace and init. It then does some basic init, then eventually starts loading kernel modules and starting userspace daemons. There are two main components to the Kronos runtime: OA.ko and Eva. OA.ko is the main synthesizer kernel module, and does all of the actual audio processing in kernelspace (using realtime services provided by RTAI). Eva is the user interface, and runs in userspace. The UI is displayed via a framebuffer exported by the NKS4 over USB, and available as /dev/fb1.

Additional userspace daemons handle miscellaneous things, such as vsftpd for the built-in FTP server, avahi for mDNS, and ifplugd to handle network interfaces. There is also an amusing fanctrld which, as far as I can tell, does nothing other than daemonize and run sleep(1) in a loop. I guess it used to be a fan control daemon in a past life and was replaced with a no-op at some point…

The synthesizer modules, userspace software, and some samples are actually stored in encrypted filesystem images—this is part of the Kronos’ security system. The keys for those images are stored in the /.pairFact3 file, themselves encrypted. The unwrapped keys are obtained by doing a key exchange with an Atmel AT88SC0204CA CryptoMemory chip that is connected to and available via the NKS4. The loadmod.ko kernel module is in charge of doing this, and it also contains other bits and pieces of obfuscation and security—in particular, it performs a filesystem integrity check and refuses to work if it detects tampering.

The filesystem images are just set up as standard cryptoloop images, with AES-256-CBC encryption… though they actually use 124-bit keys converted to ASCII hexadecimal form as the 256-bit AES key. Yes, 124-bit, not 128-bit, because they have a bug that eats the last nybble of the original 128-bit keys. The three images and their respective mountpoints are:

  • /korg/ro/Eva.img → /korg/Eva
  • /korg/ro/Mod.img → /korg/Mod
  • /korg/ro/WaveMotion.img → /korg/rw/PCM/WaveMotion
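To make the key arithmetic concrete, here is a tiny shell sketch. The key in it is a made-up placeholder (not a real Kronos key), and the helper names are mine; it only shows how losing one nybble of a 128-bit key leaves a 31-character, 124-bit hex string.

```shell
# Illustration only: the key below is a made-up placeholder, not a real Kronos key.

# Drop the last hex digit (nybble) of a key given as a hex string.
truncate_key() {
    printf '%s\n' "${1%?}"
}

# Number of key bits carried by a hex string (4 bits per hex digit).
hex_bits() {
    echo $(( ${#1} * 4 ))
}

full=00112233445566778899aabbccddeeff   # hypothetical 128-bit key, 32 hex digits
short=$(truncate_key "$full")
echo "$short (${#short} chars, $(hex_bits "$short") bits)"
```

The 31 ASCII characters are what ends up being used as AES key material, which is why the effective key strength is 124 bits rather than 256.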

This whole security mechanism is relatively pointless, because getting a root shell isn’t hard (as we will see), and once you have root, you can just nicely ask loadmod.ko to mount the encrypted filesystems and copy their contents out. Getting the image encryption keys (which are the same for every Kronos) is also easy and can be done in a few ways (you can even fit one of them in a tweet, twice). Suffice it to say that the keys are 31-character hex strings.

kronoshacker updated his blog and posted the keys and one method to get them at just about the same time as I published this post, so if you want the keys, just head over to his blog. He used a kernel patch, but my two simpler methods are to simply use the LOOP_GET_STATUS64 ioctl from userspace (it happily gives you the key), or, if you want your keydumper to fit in a tweet, just do (after mounting the filesystems):

# busybox mknod /dev/mem c 1 1
# busybox strings /dev/mem | grep -E '^[0-9a-f]{31}$' | busybox tail -n 3

Citations

kronoshacker covered quite a few topics pretty well, so you should head over to that blog now if you want the details.

Word of warning

If you attempt to do any of this to your Kronos and mess it up, that’s your fault, not mine. Do not follow any of these steps if you don’t have at least a reasonable knowledge of Linux administration. Back up your SSD contents first. I am not responsible if your Kronos stops working, catches fire, starts making dubstep, or transforms into a Roland.

In writing this post, I assume you at least know what a kernel is, how to use standard Linux command-line tools, and what busybox is and how to use it to replace missing system tools.

Getting Started: Rooting the Kronos

kronoshacker already gave us a ready-made update file that roots the Kronos and installs an SSH daemon. However, he didn’t explain how it works, so I will :-).

loadmod.ko performs a filesystem integrity check, which among other things means that it will refuse to work if you attempt to change files such as /etc/passwd or /etc/inittab, stopping trivial modification of the boot process. However, it only checks the integrity of its list of existing, expected files; it doesn’t know about or care about any additional files. It also doesn’t check anything under /boot, since that isn’t normally mounted.

kronoshacker’s approach is thus simple: instead of modifying the existing init system, install a parallel init system. The update package installs busybox and symlinks it to /bin/init (vs. the original /sbin/init). Busybox is patched to use /etc/inittab.busybox instead of /etc/inittab. Since the GRUB config isn’t checked, the kernel can be told to use init=/bin/init instead, which launches the parallel init system. Since the original files are untouched, loadmod.ko is none the wiser. The SSH daemon is dropbear, patched to use the Kronos’ FTP credentials instead of /etc/passwd for login, so you can conveniently change the root SSH password from the UI.
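If you want to double-check that the parallel init pieces are in place before rebooting, something like the following sketch works. The helper is hypothetical (it is not part of kronoshacker’s package) and only tests for the three things described above; it assumes /boot is mounted, since it normally isn’t.

```shell
# Sketch: sanity-check the parallel init setup described above.
# ROOT is a path prefix, so the check can be pointed at a mounted copy of the
# SSD as well as the live system (pass an empty string for the live system).
# Hypothetical helper; /boot must be mounted for the grub.conf check.
check_parallel_init() {
    root=$1
    [ -L "$root/bin/init" ] || { echo "missing symlink: /bin/init"; return 1; }
    [ -f "$root/etc/inittab.busybox" ] || { echo "missing: /etc/inittab.busybox"; return 1; }
    grep -q 'init=/bin/init' "$root/boot/grub/grub.conf" 2>/dev/null ||
        { echo "grub.conf lacks init=/bin/init"; return 1; }
    echo "parallel init looks OK"
}
```

On a live Kronos you would run `check_parallel_init ""` from the SSH shell.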

Defanging the Kronos security: loadmod.ko and encryption

loadmod.ko’s primary job is to do the crypto exchange, compute the keys for the encrypted loopback images, and mount them. Therefore, by copying the contents out, we should be able to get rid of it, avoiding the filesystem integrity check and speeding up boot.

However, getting rid of loadmod.ko will break audio synthesis in OA.ko. This is, again, part of the obfuscation. loadmod.ko pokes a magic value into some magic kernel state (part of the unpublished Korg kernel patches) that is checked by OA.ko. If OA.ko doesn’t find the right value, it intentionally cripples the audio synth. This is done in a somewhat amusing way: the module contains two hardcoded 128-bit SSE float constants, {1., 1., 1., 1.} and {-1., -1., -1., -1.}, appropriately called allPlusOne and allMinusOne (yes, they distribute the module with full symbols…), which are used all over the synth as constant values. If the magic value check fails, these are overwritten with {0.7, 0.7, 0.7, 0.7} and {-0.2, -0.2, -0.2, -0.2} respectively, which needless to say does very bad things to the audio.

kronoshacker already took care of this by building a custom kernel that already incorporates the magic value in the right place to satisfy OA.ko. You’ll find his prebuilt kernel, patchset, and explanation on his blog.

Installing this kernel is simply a matter of mounting /boot, copying it there, and updating the GRUB config (/boot/grub/grub.conf) to use it:

default=0
timeout=3

title kronos-bzImage-20160505 + busybox
        root (hd0,0)
        kernel /kronos-bzImage-20160505 root=/dev/sda2 max_loop=16 fbcon=map:0 memmap=384m vga=0x0303 loglevel=0 fastboot Single raid=noautodetect elevator=noop init=/bin/init

title Linux (2.6.32.11)-320m STG 8x6 + busybox
        root (hd0,0)
        kernel /bzImage root=/dev/sda2 max_loop=16 fbcon=map:0 memmap=384m vga=0x0303 loglevel=0 fastboot Single raid=noautodetect elevator=noop init=/bin/init

title Linux (2.6.32.11)-320m STG 8x6
        root (hd0,0)
        kernel /bzImage root=/dev/sda2 max_loop=16 fbcon=map:0 memmap=384m vga=0x0303 loglevel=8 fastboot Single raid=noautodetect elevator=noop
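For the copy step itself, here is a minimal sketch. The helper name is mine, and it assumes /boot is already mounted; GRUB’s (hd0,0) would normally correspond to /dev/sda1, but check that on your own unit before mounting anything. It keeps a one-time backup of grub.conf so you can always get back to the stock boot entries.

```shell
# Sketch: copy a kernel image into an already-mounted /boot and keep a backup
# of grub.conf before you edit it.
# Assumption: /boot is GRUB's (hd0,0), normally /dev/sda1; mount it first,
# e.g. "mount /dev/sda1 /boot".
install_kernel() {
    image=$1
    bootdir=${2:-/boot}
    cp "$image" "$bootdir/" || return 1
    # Create the backup only once, so re-running never overwrites the original.
    [ -e "$bootdir/grub/grub.conf.orig" ] ||
        cp "$bootdir/grub/grub.conf" "$bootdir/grub/grub.conf.orig"
}
```

Usage would be `install_kernel kronos-bzImage-20160505`, followed by editing /boot/grub/grub.conf by hand as shown above.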

After doing this, we can safely get rid of loadmod.ko. To do that, we will first copy the contents of the encrypted images out. From a booted Kronos:

# mount /korg/Mod
# mount -o remount,rw /korg/ro
# cp -vr /korg/Mod /korg/Eva /korg/ro/
# cp -vr /korg/rw/PCM/WaveMotion /korg/rw/PCM/WaveMotion2

/korg/ro doesn’t have enough free space by default, so I decided to put the WaveMotion data in /korg/rw instead, next to where it gets mounted.
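Since the rest of the process depends on these copies being complete, it’s worth comparing them against the mounted originals before going further. A minimal sketch, assuming a diff with -r support is available (busybox builds usually include one); note that this compares file contents only, not ownership or permissions.

```shell
# Sketch: compare a mounted image against its copy. Prints nothing and
# returns 0 when both trees have the same files with identical contents.
verify_copy() {
    diff -r "$1" "$2"
}

# On the Kronos, with the paths used above:
#   verify_copy /korg/Mod /korg/ro/Mod
#   verify_copy /korg/Eva /korg/ro/Eva
#   verify_copy /korg/rw/PCM/WaveMotion /korg/rw/PCM/WaveMotion2
```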

Note that /korg/Eva can only be mounted once (loadmod.ko wipes its encryption key after the first mount), but the other images can be repeatedly mounted/unmounted. Conversely, /korg/Mod is unmounted after the kernel modules it contains are loaded, but can be freely re-mounted, while /korg/Eva has to stay mounted so the userspace software can run. One wonders if Korg meant to add the single-mount protection to /korg/Mod instead, but botched it…

Next, we need to replace /sbin/loadoa, which is really just a C program trying to be a shell script (seriously, it’s full of popen calls to shell pipelines and the like…) with an actual shell script, and modify it to not load loadmod.ko, but instead just bindmount the copied-out data over the final mountpoints. I put together a /sbin/loadoa.sh that does just that. Copy it into the Kronos, make sure to chmod 775 /sbin/loadoa.sh, and then edit /etc/OA.clonos.rc (the alternate version of /etc/OA.rc installed by kronoshacker’s root pack) to call it instead of /sbin/loadoa.
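To give an idea of the bind-mount part, here is a rough sketch using the copy locations from the previous step. This is not the actual loadoa.sh: it covers only the mounts, and a real replacement also has to do everything else the original /sbin/loadoa does (such as loading the synth modules), which is not shown. The run wrapper and the DRY_RUN variable are my own additions so the commands can be previewed safely.

```shell
# Sketch of the bind-mount step only; NOT a complete /sbin/loadoa replacement.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bind the copied-out data over the mountpoints the synth expects.
bind_copies() {
    run mount -o bind /korg/ro/Mod /korg/Mod &&
    run mount -o bind /korg/ro/Eva /korg/Eva &&
    run mount -o bind /korg/rw/PCM/WaveMotion2 /korg/rw/PCM/WaveMotion
}
```

Running `DRY_RUN=1 bind_copies` just prints the three mount commands, which is a cheap way to check the paths before committing to them.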

The Plan

With the nasty bits out of the way, it’s time to plan out the hardware upgrade. kronoshacker documented his choice of hardware, which I used as a baseline. This is what I came up with:

The CPU is the same one that kronoshacker used (he also prepared a ready-made CostProfile for it, which tells OA.ko about the CPU’s performance so it can perform voice stealing correctly). kronoshacker’s Supermicro motherboard is expensive and not readily available where I live, so instead I opted for a more widely available, consumer ASUS motherboard. The ASUS motherboard does have a serial port header (in fact, most modern motherboards still have one of those, just not an actual D-SUB9 connector), so the Kronos’ serial port cable should plug in directly (kronoshacker had to solder it to a D-SUB).

I took a bit of a gamble with the serial port. As kronoshacker explained, OA.ko talks directly to the serial hardware (Super I/O chip) and expects a particular kind of chip, going beyond standard 16550 registers. kronoshacker confirmed compatibility with the Winbond W83627 (original Kronos), Nuvoton NCT6627UD (Kronos 2), and Nuvoton NCT6776 (his replacement motherboard). My motherboard choice also has a Nuvoton chip, but a different one (NCT5539D), and I wasn’t sure if it was going to be compatible out of the box or if I’d have to patch support into OA.ko. Thankfully, it did in fact just work.

The 8GB of RAM is overkill; the 32-bit OS can barely use more than 3GB normally. However, I want to try perhaps building a PAE kernel or other experiments with the high RAM (running Kronos virtualized, anyone?), and besides, buying less than 8GB of RAM just feels silly these days.

Take it apart

At this point I realized that the PSU in the Kronos does not have an ATX12V connector. However, the connector on the PSU side has an unpopulated pin that happens to be another 12V line, so I decided to use that. I cannibalized an ATX12V connector and cable from an old PSU, and cannibalized the required pin to fit the Kronos’ connector from the AUX connector of an even older PSU (which happened to have the right kind of pins). There were no unpopulated grounds, and I didn’t want to mangle the original wiring too much, so instead I decided to use a screw terminal and wire up the grounds to one of the PSU’s screws (which are grounded).

Slight ATX12V-related detour.

I also decided to add external HDMI and Ethernet connections, so I could control the boot process with the cover on. The Kronos normally uses an external USB Ethernet adapter for the (optional) network, but the internal port (which is normally not connected) is also functional and will DHCP off the network.

The original kernel does boot on the new hardware. However, that kernel doesn’t have functional USB xHCI support, so it will never be able to run the synth on the new hardware. Nonetheless, if used with a PS/2 keyboard, it does give you a usable terminal. kronoshacker’s kernel includes a backported xHCI driver that works great. The ASUS motherboard has an r8169-class Ethernet chip similar to the one on the original Atom motherboard, so that works just fine without any additional modules.

One BIOS setting is important: the CPU C states above C2 must be turned off, as otherwise they cause random hangs while loading the RTAI modules. I also took the chance to upgrade the BIOS to the latest version.

With the new hardware, the Kronos is several times more powerful than it was originally, and pretty much any possible combination will reach the hard cap of 200 voices before it maxes out the CPU.

However, there is one disappointment: with the new hardware, the Kronos has (slightly) less available sample memory than with the old one. What’s going on?

TOLUD or not TOLUD

Why does the new hardware limit memory further? Well, on a 32-bit system, the kernel can only use memory visible below 4GB. But system peripherals and other special address ranges also need to be mapped below 4GB. To accomplish this, the BIOS will take a chunk of memory off of the top of the address space, and either disable it or remap it above 4GB (depending on the BIOS config), and then use the corresponding space for peripherals and other system ranges. The boundary is set by the TOLUD (Top of Low Usable DRAM) register in the chipset config, and the BIOS setting is usually named after it.

My motherboard configures TOLUD to 3 GiB by default, and the maximum available RAM address is at 2974 MiB after some ACPI overhead:

BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009c800 (usable)
 BIOS-e820: 000000000009c800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000b9ea6000 (usable)
 BIOS-e820: 00000000b9ea6000 - 00000000b9ea7000 (ACPI NVS)
 [...]
 BIOS-e820: 00000000bffff000 - 00000000c0000000 (usable)
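The MiB figures come straight from the end addresses in the map. A quick way to do the conversion in a shell (the helper name is mine; it needs a shell with 64-bit arithmetic, which bash and typical busybox builds have):

```shell
# Convert a hexadecimal physical address (no 0x prefix) to whole MiB.
hex_to_mib() {
    echo $(( 0x$1 >> 20 ))
}

hex_to_mib b9ea6000   # end of the big usable region: prints 2974
hex_to_mib c0000000   # end of the last usable page, i.e. TOLUD: prints 3072
```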

Unfortunately, while kronoshacker’s motherboard has a user-configurable TOLUD setting, the ASUS one does not—and it sports a default setting that reserves more memory than the original motherboard. This seems like it would require a BIOS patch to fix, but fortunately, there is an easier way. To see what’s going on, we’re going to have to look at the BIOS image.

UEFI BIOSes are fairly complicated, containing many modules and nested data structures. There are tools such as UEFITool that make the process of messing with UEFI BIOS images a lot easier.

After loading the ASUS BIOS into UEFITool, we can extract the Setup module PE32 image. UEFI Setup screens are described using a kind of bytecode language. Thankfully, there are tools available to automatically decompile this too; I used Universal IFR Extractor.

Searching for TOLUD in the decompiled Setup IFR we find that, in fact, the setting is there:

Suppress If: {0A 82}
    True {46 02}
    Setting: Max TOLUD, Variable: 0x483 {05 91 FC 04 FD 04 20 06 01 00 83 04 10 10 00 0B 00}
        Default: 8 Bit, Value: 0x0 {5B 06 00 00 00 00}
        Option: Dynamic, Value: 0x0 {09 07 FE 04 00 00 00}
        Option: 1 GB, Value: 0x1 {09 07 09 05 00 00 01}
        Option: 1.25 GB, Value: 0x2 {09 07 08 05 00 00 02}
        Option: 1.5 GB, Value: 0x3 {09 07 07 05 00 00 03}
        Option: 1.75 GB, Value: 0x4 {09 07 06 05 00 00 04}
        Option: 2 GB, Value: 0x5 {09 07 05 05 00 00 05}
        Option: 2.25 GB, Value: 0x6 {09 07 04 05 00 00 06}
        Option: 2.5 GB, Value: 0x7 {09 07 03 05 00 00 07}
        Option: 2.75 GB, Value: 0x8 {09 07 02 05 00 00 08}
        Option: 3 GB, Value: 0x9 {09 07 01 05 00 00 09}
        Option: 3.25 GB, Value: 0xA {09 07 00 05 00 00 0A}
        Option: 3.5 GB, Value: 0xB {09 07 FF 04 00 00 0B}
    End of Options {29 02}

But it is permanently suppressed (i.e. hidden): Suppress If: True. Bummer.
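The option values follow a simple pattern: 0x1 is 1 GB and each step adds 0.25 GB. As a sketch (the helper name is mine, and the formula is just read off the option list above, not taken from any BIOS documentation):

```shell
# Map a "Max TOLUD" option value (1..11) to the size it selects, in MiB.
# Read off the option list above: 0x1 = 1 GB, then 0.25 GB per step.
# Value 0 ("Dynamic") has no fixed size and is rejected.
tolud_option_mib() {
    n=$(( $1 ))
    if [ "$n" -lt 1 ] || [ "$n" -gt 11 ]; then
        echo "no fixed size for option $1" >&2
        return 1
    fi
    echo $(( (n + 3) * 256 ))
}

tolud_option_mib 0x9   # the 3 GB option: prints 3072
tolud_option_mib 0xb   # the 3.5 GB option: prints 3584
```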

Although we can’t access the Setup option to change the TOLUD setting, the option, and its corresponding config variable, do exist. Therefore, we can manually change the value of the option without using the Setup utility. The values are stored in the Setup UEFI variable:

    Default Store:  0x1 {5C 06 02 00 01 00}
    Var Store: 0x1[3597] (Setup) {24 1C 43 D6 87 EC A4 EB B5 4B A1 E5 3F 3E 36 B2 0D A9 01 00 0D 0E 53 65 74 75 70 00}

One easy way to change UEFI variables is to boot into a UEFI-enabled Linux kernel (in UEFI mode) and use efivarfs. I used my trusty SystemRescueCD USB stick (which supports UEFI mode). After booting into a compatible kernel, we can see the Setup variable:

% cd /sys/firmware/efi/efivars
% ls -al Setup-*
-rw-r--r-- 1 root root  3601 May 31 21:16 Setup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9

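A bit of manual sanity-checking reveals that the variable offsets specified in the IFR data do not directly correspond to byte offsets in the efivarfs file, but rather to byte offsets starting from offset 4: efivarfs prefixes the variable data with a 4-byte attributes field, which also explains why the file is 3601 bytes long while the IFR declares a 3597-byte variable store. Therefore, our variable is at offset 0x483 + 4. The value that we want to write is 0xb (3.5 GiB):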

% cp Setup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9 ~/Setup
% cd
% cp Setup Setup.bak
% echo -ne '\x0b' | dd of=Setup bs=1 seek=$((0x483 + 4)) conv=notrunc
% cat Setup >/sys/firmware/efi/efivars/Setup-ec87d643-eba4-4bb5-a1e5-3f3e36b20da9
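Because a bad Setup variable can leave the board unbootable, it’s worth reading the byte back from the patched file before the final write. A small sketch (the helper name is mine):

```shell
# Read one byte from FILE at byte OFFSET and print it as two hex digits.
read_byte() {
    dd if="$1" bs=1 skip="$2" count=1 2>/dev/null | od -An -tx1 | tr -d ' \n'
    echo
}

# For example, before writing the variable back:
#   read_byte Setup     $((0x483 + 4))   # should print 0b after patching
#   read_byte Setup.bak $((0x483 + 4))   # the previous value
```

Running `cmp -l Setup.bak Setup` is another quick check: it should list exactly one differing byte.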

Rebooting back into the Kronos system confirms that the fix worked:

BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009c800 (usable)
 BIOS-e820: 000000000009c800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000d0d53000 (usable)
 BIOS-e820: 00000000d0d53000 - 00000000d0d54000 (ACPI NVS)
 [...]
 BIOS-e820: 00000000d7fff000 - 00000000d8000000 (usable)

This is actually a TOLUD config of 3.375 GiB instead of 3.5 GiB, but better than nothing. This should give the Kronos 3341 MiB of memory. And yet… usable memory increased slightly, but not as much as expected. In fact, we expect about 2957 MiB of usable heap (3341 MiB - 384 MiB), but OA.ko reports getting about 2619 MiB of memory:

5b000000..feb53000 is ioremapped memory, 2746560512 bytes, physicalBankStart @ 0x18800000
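The mismatch is easy to see by running the numbers from the log line and the e820 map through the shell (helper name is mine; 64-bit shell arithmetic assumed):

```shell
# Size in whole MiB of an address range START..END (hex, no 0x prefix).
range_mib() {
    echo $(( (0x$2 - 0x$1) >> 20 ))
}

# What OA.ko actually mapped, from its log line:
range_mib 5b000000 feb53000              # prints 2619

# What we hoped for: RAM below TOLUD minus the 384 MiB left to Linux.
echo $(( (0xd0d53000 >> 20) - 384 ))     # prints 2957
```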

OA.ko memory management

To understand how to optimize the available synthesizer memory, we need to look at how the Kronos does memory management. Instead of managing most of the RAM using standard Linux kernel mechanisms, the Kronos is set up to boot with only the low 384 MiB of RAM enabled (as specified with the memmap=384m kernel command-line option). Then, OA.ko takes the rest of available memory, maps it as one huge physically contiguous block into kernel space, and performs its own memory management on it.

To accomplish this, the kernel is built using a somewhat nonstandard 1G/3G user/kernel memory split. That means that, of the 4 GiB of address space available on a 32-bit system, the first 1 GiB is allocated to userspace, and the last 3 GiB are allocated to kernelspace. This allows it to map up to nearly 3GiB of synth heap into kernelspace.

I have no clue why they did it this way: normally, physically-contiguous memory is required for dumb hardware that doesn’t support scatter-gather DMA, but in this case the memory is only really used by the x86 itself. Using physically contiguous memory affords little to no benefit here over standard paged memory (it still goes through the page tables to get mapped anyway). My only guess is that the OA synth is a descendant of other Korg synth products that run directly on bare metal with no MMU, and are designed to work directly with physical memory, and Korg just carried over this architecture to the Kronos.

Either way, the end result is that the kernel virtual address space spans the 3 GiB range at 0x40000000–0xffffffff. However, the kernel by default maps all available RAM at the bottom of its address space. Since the Kronos tells the kernel to use 384MiB of RAM, that leaves 3 GiB - 384 MiB = 2688 MiB of available kernel address space to map the synthesizer heap. This is where our memory limitation is coming from.

The kernel doesn’t need all physical memory mapped at all times. In fact, in more typical systems with a 3G/1G split (1GiB for the kernel) and more than 1GiB of RAM, that’s impossible. So the kernel supports a concept of high memory which is not directly mapped. It can be instructed to reserve more space for high memory mappings with the vmalloc commandline option. Let’s give the kernel only 64 MiB of directly mapped physical memory. This means 3072 MiB - 64 MiB = 3008 MiB of vmalloc space. Subtract another 16 MiB for overhead, leaving 2992 MiB.
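If you want to experiment with other direct-map sizes, the same arithmetic can be wrapped in a one-liner. This is only a sketch of the reasoning above; the helper name is mine and the 16 MiB overhead figure is taken from this post as-is, not derived from kernel sources.

```shell
# vmalloc size (MiB) for a given directly-mapped lowmem size, following the
# reasoning above: 3072 MiB of kernel address space, minus lowmem, minus
# 16 MiB of overhead.
vmalloc_mib() {
    echo $(( 3072 - $1 - 16 ))
}

echo "vmalloc=$(vmalloc_mib 64)M"   # prints vmalloc=2992M
```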

Adding vmalloc=2992M to the kernel commandline yields:

320MB HIGHMEM available.
63MB LOWMEM available.
[...]
found a suitable region at 0x18800000 of size 0xb8553000 (3092590592).  Our size request is 0xcd000000 (3439329280)
sPhysicalBankStart = 0x18800000, physMemSize = 0xb7b53000 (3082104832), sIORemapBase = 0x47000000
47000000..feb53000 is ioremapped memory, 3082104832 bytes, physicalBankStart @ 0x18800000

2939 MiB of the 3072 MiB kernel virtual address space is used for the synthesizer heap, and the UI now reports a whopping 2277 MiB of user sample memory available (after subtracting other synth memory usage).

2277 MiB. Count them.

This is as far as we can go. Pushing the kernel direct map lower might work, but that gives diminishing returns (at most 64 MiB more), and we’re already running into the 3.26 GiB available from the BIOS as low memory (remember that userspace takes up 384 MiB of that). To get as much as 512 MiB more of sample memory, we’d need a custom kernel with a nonstandard 0.5G/3.5G virtual address space split, PAE enabled to use high memory for userspace, and several patches to OA.ko to handle that. However, the patching required may or may not be feasible, since PAE will likely cause binary module compatibility problems. There are a few small improvements that could be made (right now the kernel wastes 8 MiB of RAM in overlapped PCI address space), but they are not worth the effort to fix. For now, this is as good as it’s going to get.

Things left to do

The upgraded Kronos works great, but there are a few issues that I’d like to look at.

There is at least one Combi that has issues: I-A010 Phantasies crashes the whole Kronos if the polyphony climbs too high (this is very easy with that patch even when playing normally, no need to mash keys). This doesn’t happen with the original CostProfile data, which limits polyphony to avoid overloading the CPU. My guess is that this is a synthesizer bug that is simply never triggered on the original hardware because the voice stealer kicks in and drops voices much earlier. At least restoring the CostProfile file works around this (though it limits effective synth performance to that of the original hardware), and switching that file around is easy, so nothing major is lost.

I suspect that the CPU pinning may be wrong on the new hardware. Both the old and the new CPU are dual core hyperthreaded CPUs, but the order of the CPUs is different: on the Atom, CPUs 0,1 and 2,3 are the thread pairs, while on the Skylake they are 0,2 and 1,3. This might mean that a single core ends up effectively carrying both the synth and effects load on two threads. Then again, the new CPU is so much more massively powerful that this may not matter…
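The thread pairing can be read from sysfs, assuming the kernel exposes the usual topology files there. A small sketch (the helper name and the optional directory argument are mine):

```shell
# Print the distinct hyperthread sibling groups, one per line.
# The optional argument overrides the sysfs CPU directory (handy for testing).
# Sibling lists use the kernel's cpulist format, so a pair of adjacent CPUs
# may show up as a range like "0-1" rather than "0,1".
thread_pairs() {
    cpudir=${1:-/sys/devices/system/cpu}
    cat "$cpudir"/cpu[0-9]*/topology/thread_siblings_list 2>/dev/null | sort -u
}
```

Comparing the output of `thread_pairs` on the old and new boards against whatever CPU numbers the synth pins its threads to would confirm or rule out this suspicion.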

So far I’ve been using kronoshacker’s precompiled kernel, since it seems to work well, but I do intend to compile my own and put up my own patchset branch on GitHub when I get a chance.

2016-06-01 06:40