
Advice on tuning unix process scheduler..

Elektraglide

It was a long time ago (in a galaxy far away?) but I recall when working on Hazeltine terminals connected to a PDP 11/70 as being *mostly* responsive as you scrolled through source code etc. There were absolutely days when it chugged, but I *think* I remember it being OK... (Am I misremembering?)

Anyway, the scheduler on the Tektronix 4404.. not so much. If you are compiling C on 1 terminal, other terminal sessions are quite unresponsive... sooo, I was wondering 'How Hard Can It Be' to tune the settings the scheduler uses to decide who gets run next.

There are 4 profiles (weirdly called Personalities). Each task is assigned one, and its values are used to calculate the task's score, which determines who gets run:

sched_personalities.png


I've labelled the fields on the right based on walking thru the scheduler machine code. My question is, when you have many knobs to turn, where do you begin in figuring out which controls help and which hinder?
I realise that's a bit vague but I'm looking for suggestions as to where to start. I have of course tried just fiddling with the ucpu limit on CPU and DISK but it's not super obvious what the effect is..

I guess I'm thinking the DISK_personality values are based on spinning rust disks and perhaps a SCSI emulator has a different profile?
 
IMO the place to start is to understand why the system is unresponsive. Is the system thrashing because there is not enough memory? I do not know anything about the Tek 4404, but usually a unix-like system will do OK giving all the users enough CPU time even if one process is running compute-bound. Memory scarcity may be another story. Disk-bound may be a possibility, too, although this may again come down to memory due to disk buffering.
 
Yes, good advice. AFAIK there is no swapping going on here, so one less thing to consider.

Digging in and single stepping through the kernel assembler, I have a better picture of how the scheduler works:

A ready task always starts as CPU_personality.
If it is swapping it is made DISK_personality.
If it performs read/write it is made PIPE_personality.
If it performs tty it is made TTY_personality.

So all pretty obvious I guess. :-)
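
As a rough sketch, the four rules above amount to a simple mapping from "what the task last did" to a personality. The enum and function names here are made up for illustration and are not taken from the UniFLEX kernel:

```c
/* Hypothetical names: the real kernel identifiers are not known. */
enum personality {
    CPU_PERSONALITY,
    DISK_PERSONALITY,
    PIPE_PERSONALITY,
    TTY_PERSONALITY
};

/* Hypothetical events standing in for what the task last did. */
enum task_event { EV_READY, EV_SWAP, EV_READ_WRITE, EV_TTY };

/* Return the personality a task is given after the named event. */
enum personality personality_after(enum task_event ev)
{
    switch (ev) {
    case EV_SWAP:       return DISK_PERSONALITY;  /* task is swapping       */
    case EV_READ_WRITE: return PIPE_PERSONALITY;  /* task did a read/write  */
    case EV_TTY:        return TTY_PERSONALITY;   /* task did terminal I/O  */
    case EV_READY:
    default:            return CPU_PERSONALITY;   /* a ready task starts here */
    }
}
```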

I altered the top utility I wrote last year to show the raw priority of each task (signed byte: -128 => +127, plus a priority bias).

Task priority is evolved using the formula:

((-90-taskbias) - ucpu/4) - (usys_ratio/16)

User CPU time (ucpu) reduces priority over time, and the reduction is steeper if lots (16:1) of system time is used. It makes sense that long-running stuff gets pushed down the queue, and that using more system resources (ratio of system time to user time) increases the speed at which that happens.
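
The formula above can be tried out with a few lines of C. This is only a sketch: the function name is made up, the inputs are assumed to be non-negative integers, and the -127 floor is the one the scheduler is described as applying further down the thread.

```c
/* The scheduler is described as clamping priority at -127. */
#define PRIORITY_FLOOR (-127)

/*
 * Sketch of the priority formula from the post:
 *   ((-90 - taskbias) - ucpu/4) - (usys_ratio/16)
 * ucpu and usys_ratio are assumed to be non-negative, so C integer
 * division truncates the same way an unsigned divide would.
 */
int task_priority(int taskbias, int ucpu, int usys_ratio)
{
    int prio = ((-90 - taskbias) - ucpu / 4) - (usys_ratio / 16);
    return prio < PRIORITY_FLOOR ? PRIORITY_FLOOR : prio;
}
```

For example, a task with no bias, ucpu of 40 and usys_ratio of 32 comes out at -90 - 10 - 2 = -102, while a negative bias of -10 on an idle task gives -80.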

Will dig in some more!
 
Graphing the task priority for CPU, TTY, DISK, PIPE gives some insight into how ucpu delta/limit are used (red,green,blue,magenta for respective personalities)

Screenshot 2026-04-15 at 19.27.43.png

The ucpu delta controls the gradient, but the net effect is that CPU, TTY and DISK all end up at about the same priority of ~ -90.

The other dynamically calculated value per task is the quantum of work before being preempted which uses what I dumbly labelled scheduler delta / scheduler limit. Here you can see the DISK personality ramps up the quantum over time (blue) whereas the CPU / TTY (red / green) ramp up fast to a lower quantum.

Screenshot 2026-04-15 at 19.28.12.png

The rationale for making PIPE personality very niced but with a very large quantum is a bit beyond me ATM..

But I'm beginning to form some ideas; I would have expected a CPU profile to be set up so it smoothly gets lower and lower priority over time to be a Good Citizen, no?

And my guess would be that giving CPU a much shorter quantum would make for a smoother feeling system. If a CPU heavy task is uncontended, it's going to get the resource anyway, so don't just grab it regardless...
 

Well, if your hypothesis for how something works doesn't match reality, it's probably wrong! After posting about the task priority and commenting that I would expect a long running task to be handled more dynamically, I looked at the code again and found I had totally misread how the personality "divider" setting is used. The code is this:

Screenshot 2026-04-16 at 10.44.18.png

Which I had read as: get the task ucpu value, divide it by the personality divider/scaling at (0x8,A1) and save it back to ucpu. Looking again, it's taking the MOD of that - big difference - because now we get a sawtooth shape that slowly deprioritizes a task but gives it a periodic boost back up, with the period of this getting longer and longer.
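
A toy illustration of why MOD gives a sawtooth where a plain divide would not. This is not the kernel's code: the function name and the delta/divider values are made up, and this simple wrap-around does not reproduce the lengthening period described above.

```c
#include <stddef.h>

/*
 * Fill out[0..n-1] with a ucpu value after each tick:
 * add delta, then wrap with modulo divider (divider must be non-zero).
 * The value climbs steadily and then drops back, giving a sawtooth.
 */
void ucpu_sawtooth(unsigned delta, unsigned divider, unsigned *out, size_t n)
{
    unsigned ucpu = 0;
    for (size_t i = 0; i < n; i++) {
        ucpu = (ucpu + delta) % divider;
        out[i] = ucpu;
    }
}
```

With delta 10 and divider 40 the trace is 10, 20, 30, 0, 10, 20, 30, 0: the ramp is the slow deprioritization and the drop back to 0 is the periodic boost.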


Screenshot 2026-04-16 at 10.43.29.png

[scheduler applies a hard floor of -127 to priority which is what PIPE is hitting]


To @Uniballer's question (a most excellent handle I might add!) about getting to the root cause of WHY the system is unresponsive: it is about scheduling, but it is tricky to find a smoking gun as the whole system is unresponsive.. Now that I understand the scheduler a bit, I am going to try and tweak my version of top to ensure it ALWAYS gets run and is never starved, and see what I see..
 
Using my newfound understanding of the Uniflex scheduler, I added a little helper that allows boosting a process beyond what nice() can do by using the "task priority bias" field.
For certain things, like having top be able to show me what's going on when there are CPU heavy processes, or the window manager being able to actually repaint when there are resource heavy processes running - it's night and day. I do remember many years ago having a "Huh" moment when reading that Apple have their window manager run at far higher priority than anything else to ensure it's always smooth.

 
Oof, bit of mission creep but watching the video above and the window dragged around all blank made me wince a bit...
I remember being knocked out by watching Sun Microsystems NeWS drag the entire window contents when you moved it rather than the normal wireframe box..

So I added some BitBlt-ing as you drag so you can see the contents of the window (made possible because the wmgr is no longer starved of CPU).

 
I've updated the Tektronix 4404 emulator with the latest disk image with all these changes if you want to give it a spin yourself and follow along.

Runs on Ubuntu / Mac / Windows.

https://elektraglide.com

tek4404.jpg
 
How often does the scheduler run that division? Afaik division takes about 100 cycles!

Also: How does I/O work? If keyboard/serial port input is polled or if the interrupt code is slow, it might contribute to things feeling sluggish.
 

The I/O is all interrupt driven - I don't think that's the problem.

A quantum of 150 means 15 slices, as the quantum is actually implemented in units of 10 ticks. Raising the priority of stuff like the window manager does make for a nicer experience, but if you are compiling in a terminal, the telnet daemon is still completely starved and you can't log in. I think reducing the quantum of DISK would help.
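
To put a number on that, here is the arithmetic as a small sketch. It assumes the "ticks" are the 1/100 s bookkeeping ticks and that one slice is the 0.1 s scheduler tick, both of which come up later in the thread; the names are made up for illustration.

```c
/* Assumption: quantum is counted in 1/100 s ticks, 10 ticks per slice. */
#define TICKS_PER_SLICE 10
/* Assumption: one slice is one 0.1 s scheduler tick. */
#define MS_PER_SLICE    100

/* Number of scheduler slices a quantum value buys. */
unsigned quantum_to_slices(unsigned quantum)
{
    return quantum / TICKS_PER_SLICE;
}

/* Wall-clock length of a quantum in milliseconds, under the assumptions above. */
unsigned quantum_to_ms(unsigned quantum)
{
    return quantum_to_slices(quantum) * MS_PER_SLICE;
}
```

Under those assumptions a quantum of 150 is 15 slices, i.e. up to 1.5 seconds before preemption, which would go some way to explaining why other sessions feel starved.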

How often does the scheduler run that division? Afaik division takes about 100 cycles!
Back in the day, if my team wrote code like that I would have had some harsh words! The thing is littered with redundant divu.w! Divide by 1, divide by 2 etc.. And stuff like this: the scheduler bookkeeps in 1/100th of a second ticks, but all scheduling is done in units of 1/10th of a second, so EVERY scheduling function reads a number and does a divu #10. Argggggggghhhhh.

I am thinking of running a patching script that finds and fixes pointless divu/divs to see if it's measurably different.

I guess a quick hack would be to tweak MAME emulation of divu/divs to be 4 cycles or something... hmm, that might be easy to do..
 
That's vintage C compilers for you, I guess? Could be another reason why Tek did an internal BSD port.
Meh, we ALL used >> for power of 2 divides back in the day. Because we all understood that C compilers - at that time - were all a bit shit at emitting decent code. When gcc arrived it was such a big deal that it could actually spot useful optimizations rather than the lame-oh-so-too-late "peep hole optimizers" :-)
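
For reference, the shift trick looks like this in C. It is a generic sketch rather than anything from the Uniflex sources, and the function name is made up. It is only an exact replacement for unsigned values: for negative signed values a right shift is implementation-defined and does not round the same way as `/`.

```c
#include <stdint.h>

/*
 * Divide an unsigned value by 2^shift without a divide instruction.
 * For unsigned operands this gives exactly value / (1u << shift).
 * shift must be less than 32.
 */
uint32_t div_pow2(uint32_t value, unsigned shift)
{
    return value >> shift;
}
```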
 
How often does the timer tick run? I tried running NetBSD on a Mac SE/30 about forever ago and had to reduce the timer interrupt from 100 Hz to 10 Hz just so the initial boot after install would re-index the manpages in my lifetime. At 16 MHz, a 100 Hz timer interrupt just couldn't let the 68030 get out of its own way to do anything useful.
 
Yes, confirmed: the scheduler tick is 0.1s (6 vblanks). And that is the reason it adds #10 to the bookkeeping values each time, to keep them in units of 1/100th of a second 🤪 .

Damn, the more I go down this rabbit hole.. I'm trying to find a clearer way of seeing when tasks are retired so I can start reliably testing changes to the Personalities etc.
 
More reverse engineering delights. Every task structure has a tsmode bitfield representing:

Code:
#define TCORE  0x01   /*  task is in core  */
#define TLOCK  0x02   /*  task is locked in core  */
#define TSYSTM 0x04   /*  task is system scheduler  */
#define TTRACP 0x08   /*  task is being traced  */
#define TSWAPO 0x10   /*  task is being swapped  */
#define TARGX  0x20   /*  task is in argument expansion  */

There is also a tsmode2 bitfield that is entirely undocumented. However, re-reading the scheduler assembler for the Nth time, and cross checking with 2 or 3 other places in the kernel that reference tsmode2, it appears the bits represent:

Code:
/* task mode2 codes undocumented */
#define TUNKNOWN 0x01  /* single stepping?  */
#define TFROZEN 0x08
#define TFIXEDPRIORITY 0x40
#define TDETACHEDTERMINAL 0x80

You can see in the scheduler that it simply skips all the task weighting if TFIXEDPRIORITY is set:

Screenshot 2026-04-23 at 16.43.19.png


And some quick tests confirm setting sometask.tsmode2 |= TFIXEDPRIORITY does make it have a constant priority based solely on -90 - taskprioritybias. Cool!
Unfortunately, it looks like it is inherited by child processes so you can quickly bork the system when everyone has top priority! Doh.
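
In C terms, the behaviour described above would look something like the sketch below. The struct layout is invented for illustration (only the tsmode2 and priority bias names come from the posts), and the weighting is the priority formula quoted earlier in the thread, not decompiled kernel code.

```c
#define TFIXEDPRIORITY 0x40   /* tsmode2 bit, as reverse engineered above */

/* Hypothetical task record: the real Uniflex layout is not shown here. */
struct task {
    unsigned char tsmode2;
    int taskprioritybias;
    int ucpu;
    int usys_ratio;
};

/* Priority with the usual weighting skipped when TFIXEDPRIORITY is set. */
int effective_priority(const struct task *t)
{
    int base = -90 - t->taskprioritybias;

    if (t->tsmode2 & TFIXEDPRIORITY)
        return base;                      /* constant: bias only */

    return base - t->ucpu / 4 - t->usys_ratio / 16;
}
```

Because the flag appears to be inherited, every child of a flagged task would take the early return too, which matches the "everyone has top priority" failure.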
 
If you ever feel that Uniflex is too broken, check out HP-UX on Integral PC (also m68k, different custom MMU, accelerated graphics) -- should work fine in mainline MAME.
 
Funny. Sure, I could install Ubuntu on an old laptop and it would all work..

But - as you know - that is not the point! And perhaps it gets to the heart of this. I love programming. I love debugging. I love figuring out how stuff works. The journey IS the enjoyment, not the destination. It's the same as people doing crosswords for enjoyment. There is no "POINT". It's just a fun distraction.

I do think it's very sad that the roaring passion with which we pursued software dev in the past seems to have been lost.. But hey ho.
 