New sysinfo release (OSX performance improvements)

Hi everyone! Today, sysinfo has received a new update with nice performance improvements on OSX. As a reminder, sysinfo is a crate used to get a system's information (Linux, Windows, OSX, Android and Raspberry Pi are supported).

This release is the last one in the performance improvements series (you can read the previous blog posts here and here). This time I focused on OSX, and as you'll see, it's been quite the trip!

Improvements on disks retrieval

On OSX, you can get the list of disks by reading the entries in "/Volumes". However, if you want to get a disk's type, you need to use the OSX API, which is very slow. Luckily, I discovered that I could get this information by calling it once at the beginning and then just storing those types.
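To make the idea more concrete, here is a minimal sketch of "look the type up once, then reuse it". It is not sysinfo's actual code: DiskKind, DiskKindCache and the slow_lookup closure (a stand-in for the slow OSX API call) are all hypothetical names made up for this illustration.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical disk kinds, not sysinfo's real type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DiskKind {
    Hdd,
    Ssd,
    Unknown,
}

/// Remembers the kind of each mount point so the slow lookup runs once per disk.
pub struct DiskKindCache {
    kinds: HashMap<PathBuf, DiskKind>,
}

impl DiskKindCache {
    /// Calls `slow_lookup` (stand-in for the slow system API) once for every
    /// entry found in `volumes_dir` (for example "/Volumes") and stores the result.
    pub fn new(
        volumes_dir: &Path,
        mut slow_lookup: impl FnMut(&Path) -> DiskKind,
    ) -> io::Result<Self> {
        let mut kinds = HashMap::new();
        for entry in fs::read_dir(volumes_dir)? {
            let path = entry?.path();
            let kind = slow_lookup(&path);
            kinds.insert(path, kind);
        }
        Ok(Self { kinds })
    }

    /// Cheap lookup used on every later refresh; unknown paths get `Unknown`.
    pub fn kind_of(&self, mount_point: &Path) -> DiskKind {
        self.kinds
            .get(mount_point)
            .copied()
            .unwrap_or(DiskKind::Unknown)
    }
}
```

Building the cache with DiskKindCache::new(Path::new("/Volumes"), ...) pays the slow cost once; later refreshes only call kind_of, which is a plain map lookup.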

No surprise, we got quite the improvement:

Before:
test bench_refresh_disk_lists   ... bench:     820,798 ns/iter (+/- 150,281)

After:
test bench_refresh_disk_lists   ... bench:      29,327 ns/iter (+/- 3,104)

Improvements on temperatures retrieval

Getting temperatures on OSX is tricky. You have to send an id to the IOConnectCallStructMethod function (for example "TC0P" if you want the CPU temperature) and it returns the desired information: an "sp78" formatted value, which is your temperature.
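For readers who haven't met "sp78" before: it is commonly described as a signed fixed-point number stored on 16 bits, 8 of which are the fractional part. A minimal decoding sketch, assuming the two bytes arrive in big-endian order (the function name is mine, not sysinfo's):

```rust
/// Decodes an "sp78" value into degrees Celsius.
/// Assumption: `raw` holds the two bytes in big-endian order, and the value
/// is a signed 16-bit fixed-point number with 8 fractional bits.
pub fn sp78_to_celsius(raw: [u8; 2]) -> f32 {
    // Dividing by 2^8 moves the binary point back to where it belongs.
    f32::from(i16::from_be_bytes(raw)) / 256.0
}
```

With this reading, the bytes 0x2D 0x80 decode to 45.5 degrees.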

But this is where the trouble starts: you need to call it twice, the first time to know how much space to allocate for your buffer and the second time to actually get the information. And I suppose you've guessed by now that this function is very slow as well? Well, you're right!

So after some investigation, I realized that the size returned by the first call never changes for a given id, so I stored it and now only call IOConnectCallStructMethod once to get the temperature. No surprise, the computation time was divided by 2:

Before:
test bench_refresh_temperatures ... bench:     651,913 ns/iter (+/- 140,283)

After:
test bench_refresh_temperatures ... bench:     294,323 ns/iter (+/- 41,612)
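The trick of remembering the size per id can be sketched like this. It is only an illustration under assumptions: SizeCache and the query_size closure (standing in for the first, size-only call) are hypothetical and don't touch the real IOKit API.

```rust
use std::collections::HashMap;

/// Remembers, for each sensor id, the buffer size reported by the first call.
#[derive(Default)]
pub struct SizeCache {
    sizes: HashMap<String, usize>,
}

impl SizeCache {
    /// Returns the buffer size for `key`. `query_size` (stand-in for the slow
    /// size-only call) is only run the first time a given key is seen.
    pub fn size_for(
        &mut self,
        key: &str,
        query_size: impl FnOnce(&str) -> Option<usize>,
    ) -> Option<usize> {
        if let Some(&size) = self.sizes.get(key) {
            return Some(size);
        }
        // Failures are not cached, so a later attempt can still succeed.
        let size = query_size(key)?;
        self.sizes.insert(key.to_owned(), size);
        Some(size)
    }
}
```

After the first successful size_for("TC0P", ...), each refresh only needs the single call which actually reads the value.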

Improvements on processes retrieval

This part is the one which took me the most time. I mostly struggled with permission issues. For this one, I mostly reordered calls to avoid making more than necessary: if one fails, there is no need to get extra information. However, what took the most time in this function was this:

match Command::new("/bin/ps")
    .arg("wwwe")
    .arg("-o")
    .arg("ppid=,command=")
    .arg(pid.to_string().as_str())
    .output()

At the time, I didn't find another way to get these two pieces of information. And currently, I still don't know how to get the parent pid. For the command, there is proc_pidpath so it's fine. A limitation I have on OSX is that I can't use a lot of functions because they require types whose API is very unstable. It is highly frustrating to not be able to get what you want because you're limited by an API. I could write some C code to get it, compile it in build.rs as a dependency and call it from sysinfo, but I didn't want to force people to have a C compiler around. So in the end, I decided that some processes won't have a parent pid. Generally, these processes are run by the root user.

The last thing I changed was to divide a call which retrieved a lot (and certainly too much, hence the permission issue) into two smaller ones. So proc_pidinfo, which was called with PROC_PIDTASKALLINFO, is now called twice: first with PROC_PIDTBSDINFO and then with PROC_PIDTASKINFO. It gives almost as much information and, surprisingly, runs just as fast.
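The "two smaller calls, stop as soon as one fails" shape looks roughly like the sketch below. Everything in it is hypothetical: the struct fields are a made-up subset, and the two closures merely stand in for the two proc_pidinfo calls, so this shows the control flow and not sysinfo's real code.

```rust
/// Hypothetical subset of what the first, smaller query could return.
pub struct BsdInfo {
    pub uid: u32,
}

/// Hypothetical subset of what the second, smaller query could return.
pub struct TaskInfo {
    pub resident_bytes: u64,
}

pub struct ProcessInfo {
    pub pid: i32,
    pub uid: u32,
    pub resident_bytes: u64,
}

/// Runs the two queries in order. If the first one fails (for example
/// because of missing permissions), the second one is never run.
pub fn gather_process(
    pid: i32,
    query_bsd: impl FnOnce(i32) -> Option<BsdInfo>,
    query_task: impl FnOnce(i32) -> Option<TaskInfo>,
) -> Option<ProcessInfo> {
    let bsd = query_bsd(pid)?;
    let task = query_task(pid)?;
    Some(ProcessInfo {
        pid,
        uid: bsd.uid,
        resident_bytes: task.resident_bytes,
    })
}
```

The `?` operator does the early return, so a process we can't inspect costs one failed call instead of several.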

Time to look at the performance improvement:

Before:
test bench_new                  ... bench:  58,645,988 ns/iter (+/- 7,263,482)
test bench_refresh_all          ... bench:   4,360,295 ns/iter (+/- 491,905)
test bench_refresh_processes    ... bench:   2,517,205 ns/iter (+/- 233,035)

After:
test bench_new                  ... bench:   4,713,851 ns/iter (+/- 1,080,986)
test bench_refresh_all          ... bench:   1,639,098 ns/iter (+/- 191,147)
test bench_refresh_processes    ... bench:     782,977 ns/iter (+/- 30,958)
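To put these numbers in perspective, here is a small sketch which computes the speed-up for every benchmark quoted in this post. The figures are copied from the results above; the speedup and report helpers are just names picked for this example.

```rust
/// (benchmark name, ns/iter before, ns/iter after), copied from this post.
pub const RESULTS: [(&str, u64, u64); 5] = [
    ("bench_refresh_disk_lists", 820_798, 29_327),
    ("bench_refresh_temperatures", 651_913, 294_323),
    ("bench_new", 58_645_988, 4_713_851),
    ("bench_refresh_all", 4_360_295, 1_639_098),
    ("bench_refresh_processes", 2_517_205, 782_977),
];

/// How many times faster the "after" measurement is.
pub fn speedup(before_ns: u64, after_ns: u64) -> f64 {
    before_ns as f64 / after_ns as f64
}

/// One human-readable line per benchmark.
pub fn report() -> Vec<String> {
    RESULTS
        .iter()
        .map(|&(name, before, after)| {
            format!("{}: {:.1}x faster", name, speedup(before, after))
        })
        .collect()
}
```

This works out to roughly 28x for the disk list, a bit more than 2x for temperatures and about 12x for bench_new.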

Clean up

I also used this opportunity to clean up the OSX part of the source code. It was a mess and most of the logic was in a single file. This is now way better. :)

Conclusion

This third performance improvement was long overdue and closes this performance improvement series. Now that sysinfo has better performance, I'll be able to focus on adding more requested features like the load info. More to come later!

Posted on the 06/01/2020 at 02:30 by @GuillaumeGomez