@nickodell
Last active October 18, 2025 19:51
scipy docs troubleshooting

Docs Performance Investigation Plan

Date: 2025-10-16

The following is a set of troubleshooting ideas for diagnosing the problem with the docs.scipy.org server.

(Note: The following instructions assume that you are running Ubuntu 20.04 LTS, which is my best guess for what the server is running based on its Apache version / server banner.)

Linux Performance Analysis in 60,000 Milliseconds

The first tools I would suggest are the ones Brendan Gregg recommends in his article Linux Performance Analysis in 60,000 Milliseconds.

uptime
dmesg | tail
vmstat 1 --unit M
mpstat -P ALL 1
pidstat 1
iostat -xz 1
free -m
sar -n DEV 1
sar -n TCP,ETCP 1
top

Nine of these ten tools can be run without root access. The article I linked explains how to interpret the output of each of these tools.

I would also suggest the following tool, which shows the total number of TCP connections. This may help identify problems caused by the web server running out of connections.

ss -s
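If the summary from ss -s looks suspicious, a per-state breakdown can help show whether connections are piling up in a particular state. The following is a minimal sketch; count_states is a hypothetical helper name, and it assumes the usual ss -tan layout where the first line is a header and the first column is the connection state.

```shell
# count_states: read `ss -tan` output on stdin and print the number of
# TCP connections in each state, largest count first.
# (Hypothetical helper; assumes a header line and the state in column 1.)
count_states() {
    awk 'NR > 1 { c[$1]++ } END { for (s in c) print c[s], s }' | sort -rn
}

# Usage on the server:
#   ss -tan | count_states
```

A large number of connections in one state compared with the configured Apache worker limit would be worth a closer look.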

Each of these tools can be made into a monitoring tool by writing a Bash script which runs it in a loop, sleeps, and redirects the output to a file.
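As a rough sketch of such a script: the function below runs a command a fixed number of times, writes a UTC timestamp before each run, and appends everything to a log file. The name monitor, its argument order, and the log path in the usage comment are my own choices for illustration, not anything already on the server.

```shell
# monitor <interval_seconds> <iterations> <logfile> <command...>
# Run the command repeatedly, appending timestamped output to the logfile.
# (Hypothetical helper for illustration.)
monitor() {
    local interval="$1"
    local iterations="$2"
    local logfile="$3"
    shift 3
    local i=0
    while [ "$i" -lt "$iterations" ]; do
        {
            # Timestamp header so samples can be matched against outages.
            printf '=== %s ===\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')"
            "$@"
        } >> "$logfile" 2>&1
        sleep "$interval"
        i=$((i + 1))
    done
}

# Usage: sample the TCP summary once a minute for 24 hours.
#   monitor 60 1440 "$HOME/ss-summary.log" ss -s
```

Bounding the number of iterations keeps a forgotten monitor from filling the disk.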

sysstat

It's possible that whatever is going wrong is only occurring 1% of the time, and 99% of the time the server is perfectly fine.

sysstat is very useful for diagnosing problems that only happen intermittently. In addition to the ad-hoc mode used by Brendan Gregg above, it can also be set up to log information every 10 minutes on CPU, memory, etc. This is configured by editing the file /etc/default/sysstat. The information can be viewed with sar.

This could be used for two purposes:

  1. It could show that the server is underpowered for its load.
  2. It could be combined with a monitoring tool like UptimeRobot to identify a correlation between e.g. CPU use and the server being nonresponsive.
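A sketch of how the logging might be switched on and read back. I am assuming the Debian/Ubuntu packaging here: that /etc/default/sysstat contains an ENABLED= line, that the service is named sysstat, and that daily data files live under /var/log/sysstat. Please check these against the actual server before relying on them. enable_sysstat is a hypothetical helper name.

```shell
# enable_sysstat <config-file>
# Set ENABLED="true" in a Debian/Ubuntu-style sysstat defaults file,
# keeping a .bak copy of the original next to it.
# (Hypothetical helper; assumes the file has a line starting with ENABLED=.)
enable_sysstat() {
    sed -i.bak 's/^ENABLED=.*/ENABLED="true"/' "$1"
}

# Usage on the server (needs root; service name and paths are assumptions):
#   enable_sysstat /etc/default/sysstat
#   systemctl restart sysstat
#
# Reading the history back later:
#   sar -u                                 # CPU utilisation, today
#   sar -r                                 # memory utilisation, today
#   sar -q                                 # load average and run queue, today
#   sar -n DEV -f /var/log/sysstat/sa16    # network, 16th of the month
```

Comparing the sar timestamps with the outage times reported by an uptime monitor is what would establish the correlation described above.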

Too much load for given resources

The above suggestions assume that there is a bottleneck or misconfiguration on the server, and focus on how to identify it. However, it is also possible that the server is configured correctly but is simply not powerful enough to handle the current load. For example, there might be poorly-written scrapers making too many requests to the docs server.

To check whether this is the case, start by counting the requests since the log was last rotated.

wc -l /var/log/apache2/access.log
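A single total does not show whether the load is steady or comes from a few heavy clients, so it may also be worth breaking the count down. The sketch below assumes the log uses Apache's common or combined format, with the client address in the first field and the timestamp in the fourth. top_clients and requests_per_hour are hypothetical helper names.

```shell
# top_clients <n> <logfile>
# Print the n client addresses with the most requests, with their counts.
top_clients() {
    awk '{ print $1 }' "$2" | sort | uniq -c | sort -rn | head -n "$1"
}

# requests_per_hour <logfile>
# Print request counts grouped by day and hour, e.g. "16/Oct/2025:19".
requests_per_hour() {
    # $4 looks like "[16/Oct/2025:19:51:00"; keep the date and hour only.
    awk '{ print substr($4, 2, 14) }' "$1" | sort | uniq -c
}

# Usage on the server:
#   top_clients 20 /var/log/apache2/access.log
#   requests_per_hour /var/log/apache2/access.log
```

A handful of addresses accounting for most of the requests would point towards scrapers rather than organic traffic.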

If the request count is larger than your hardware can reasonably support, there are two options.

  1. If budget allows, one option would be moving to a larger VM.

  2. Add some caching layer in front of the VM.

    You could evaluate how useful this would be by looking at the percentage of page URLs that are repeated within an N minute window. That would let you establish whether caching would be useful. (My guess is that it would be, but on the other hand SciPy offers versioned access to the docs, which might increase the number of distinct URLs too much for caching to help.)

    One option for a caching layer would be Cloudflare. Cloudflare also offers free credits to open-source projects, which would be a helpful way to accomplish this on a budget. As I understand it, SciPy is already using Cloudflare for DNS hosting.
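The repeated-URL percentage mentioned above could be estimated along these lines. This is an approximation under stated assumptions: the log is in Apache's common or combined format, only GET requests are counted, and the N minute window is a fixed bucket (for example 19:50 to 19:59) rather than a sliding window, so repeats that straddle a bucket boundary are missed. repeat_ratio is a hypothetical helper name.

```shell
# repeat_ratio <window_minutes> <logfile>
# Print the percentage of GET requests whose URL was already requested
# earlier in the same fixed N-minute bucket.
repeat_ratio() {
    awk -v n="$1" '
        $6 == "\"GET" {
            # $4 looks like "[16/Oct/2025:19:51:00": split into date, HH, MM, SS.
            split(substr($4, 2), t, ":")
            bucket = t[1] ":" int((t[2] * 60 + t[3]) / n)
            total++
            # seen[...]++ is zero (false) the first time a URL appears in a bucket.
            if (seen[bucket, $7]++) repeats++
        }
        END {
            if (total > 0) printf "%.1f\n", 100 * repeats / total
            else print "0.0"
        }
    ' "$2"
}

# Usage on the server, with a 10 minute window:
#   repeat_ratio 10 /var/log/apache2/access.log
```

The result is only a rough upper bound on the cache hit rate, since a real cache would also depend on response headers and on how long entries are kept.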

Tuning Apache

I have a few suggestions for tuning the Apache server for better performance.

  1. Enabling MPM event. (This is probably already enabled, as it's the default in Ubuntu 20.04, but if you ported an Apache configuration from a previous host, you might be using prefork.)
  2. Tuning the worker count to avoid running out of either connections or memory. See also Apache Performance Tuning.
  3. Tuning KeepAliveTimeout to reduce the number of concurrent connections.
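Before changing any of these, it helps to record what the server is currently using. The sketch below reads the MPM from the output of apache2ctl -V and lists the relevant directives from the configuration tree. detect_mpm and show_directives are hypothetical helper names, and /etc/apache2 is the Ubuntu default location, which I am assuming applies here.

```shell
# detect_mpm: read `apache2ctl -V` output on stdin and print the MPM name
# in lower case, e.g. "event" or "prefork".
detect_mpm() {
    awk -F': *' '/^Server MPM/ { print tolower($2) }'
}

# show_directives <config-dir>
# List uncommented connection and worker directives with file and line number.
show_directives() {
    grep -RinE '^[[:space:]]*(KeepAlive|KeepAliveTimeout|MaxKeepAliveRequests|MaxRequestWorkers|ThreadsPerChild|ServerLimit)[[:space:]]' "$1"
}

# Usage on the server:
#   apache2ctl -V | detect_mpm
#   show_directives /etc/apache2
```

If the first command prints prefork, the first suggestion applies; the second command shows the starting values for the worker count and KeepAliveTimeout.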

Benchmarking

If I knew how this server is configured (CPU, memory, software versions, Apache config), I could duplicate that configuration on a VM in another environment, benchmark it and similar configurations using ApacheBench, and give you more specific advice.

One other option that is worth considering is Nginx. I have personally found it to be faster at serving static files and proxying traffic. I could benchmark it and your current configuration to see whether it is worth switching. The practicality of this option would depend on the complexity of your Apache configuration.
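For a comparison like this, the headline figure from ApacheBench is the mean requests per second. A minimal sketch of extracting it so that runs against different configurations can be compared side by side; requests_per_second is a hypothetical helper name, and the host in the usage comment is a placeholder for a test VM, not the production server.

```shell
# requests_per_second: read `ab` output on stdin and print the mean
# "Requests per second" value.
requests_per_second() {
    awk '/^Requests per second:/ { print $4 }'
}

# Usage against a test VM (placeholder host name):
#   ab -n 1000 -c 50 -k http://test-vm.example/doc/scipy/ | requests_per_second
```

Here -n is the total number of requests, -c the concurrency, and -k enables keep-alive. Running the same command against each candidate configuration gives a like-for-like number.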
