
Commit e88ed22

Daniel Bristot de Oliveira authored and rostedt committed
tracing/timerlat: Add user-space interface
Going a step further, we propose a way to use any user-space workload as the task waiting for the timerlat timer. This is done via a per-CPU file named osnoise/per_cpu/cpu$id/timerlat_fd. The timerlat_fd file can be opened by only one task at a time.

When a task reads the file, the timerlat timer is armed to fire osnoise/timerlat_period_us in the future. When the timer fires, it prints the IRQ latency and wakes up the user-space thread waiting in the timerlat_fd. The thread then starts to run, executes the timerlat measurement, prints the thread scheduling latency and returns to user-space.

When the thread rereads the timerlat_fd, the tracer will print the user-ret(urn) latency, which is an additional metric. This additional metric is also traced by the tracer and can be used, for example, to measure the context switch overhead from kernel-to-user and user-to-kernel, or the response time for an arbitrary execution in user-space.

The tracer supports one thread per CPU, the thread must be pinned to the CPU, and it cannot migrate while holding the timerlat_fd. The reason is that the tracer is per CPU (nothing prohibits the tracer from allowing migrations in the future). The tracer monitors the migration of the thread and disables the tracer if a migration is detected.

The timerlat_fd is only available for opening/reading when the timerlat tracer is enabled and NO_OSNOISE_WORKLOAD is set.

The simplest way to activate this feature from user-space is:

-------------------------------- %< -----------------------------------

int main(void)
{
        char buffer[1024];
        int timerlat_fd;
        int retval;
        long cpu = 0;   /* place in CPU 0 */
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(cpu, &set);

        if (sched_setaffinity(gettid(), sizeof(set), &set) == -1)
                return 1;

        snprintf(buffer, sizeof(buffer),
                "/sys/kernel/tracing/osnoise/per_cpu/cpu%ld/timerlat_fd",
                cpu);

        timerlat_fd = open(buffer, O_RDONLY);
        if (timerlat_fd < 0) {
                printf("error opening %s: %s\n", buffer, strerror(errno));
                exit(1);
        }

        for (;;) {
                retval = read(timerlat_fd, buffer, 1024);
                if (retval < 0)
                        break;
        }

        close(timerlat_fd);
        exit(0);
}

-------------------------------- >% -----------------------------------

When disabling timerlat, if there is a workload holding the timerlat_fd, SIGKILL will be sent to the thread.

Link: https://lkml.kernel.org/r/69fe66a863d2792ff4c3a149bf9e32e26468bb3a.1686063934.git.bristot@kernel.org
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: William White <chwhite@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

1 parent cb7ca87 commit e88ed22

3 files changed, +455 -5 lines changed

Documentation/trace/timerlat-tracer.rst

+78
@@ -180,3 +180,81 @@ dummy_load_1ms_pd_init, which had the following code (on purpose)::
                 return 0;
 
         }
+
+User-space interface
+---------------------------
+
+Timerlat allows user-space threads to use timerlat infrastructure to
+measure scheduling latency. This interface is accessible via a per-CPU
+file descriptor inside $tracing_dir/osnoise/per_cpu/cpu$ID/timerlat_fd.
+
+This interface is accessible under the following conditions:
+
+- timerlat tracer is enabled
+- osnoise workload option is set to NO_OSNOISE_WORKLOAD
+- The user-space thread is affined to a single processor
+- The thread opens the file associated with its single processor
+- Only one thread can access the file at a time
+
+The open() syscall will fail if any of these conditions are not met.
+After opening the file descriptor, user space can read from it.
+
+The read() system call will run the timerlat code that will arm the
+timer in the future and wait for it as the regular kernel thread does.
+
+When the timer IRQ fires, the timerlat IRQ will execute, report the
+IRQ latency and wake up the thread waiting in the read. The thread will be
+scheduled and report the thread latency via the tracer, as for the kernel
+thread.
+
+The difference from the in-kernel timerlat is that, instead of re-arming
+the timer, timerlat will return from the read() system call. At this point,
+the user can run any code.
+
+If the application rereads the timerlat file descriptor, the tracer
+will report the return from user-space latency, which is the total
+latency. If this is the end of the work, it can be interpreted as the
+response time for the request.
+
+After reporting the total latency, timerlat will restart the cycle, arm
+a timer, and go to sleep for the following activation.
+
+If at any time one of the conditions is broken, e.g., the thread migrates
+while in user space, or the timerlat tracer is disabled, the SIGKILL
+signal will be sent to the user-space thread.
+
+Here is a basic example of user-space code for timerlat::
+
+        int main(void)
+        {
+                char buffer[1024];
+                int timerlat_fd;
+                int retval;
+                long cpu = 0;   /* place in CPU 0 */
+                cpu_set_t set;
+
+                CPU_ZERO(&set);
+                CPU_SET(cpu, &set);
+
+                if (sched_setaffinity(gettid(), sizeof(set), &set) == -1)
+                        return 1;
+
+                snprintf(buffer, sizeof(buffer),
+                        "/sys/kernel/tracing/osnoise/per_cpu/cpu%ld/timerlat_fd",
+                        cpu);
+
+                timerlat_fd = open(buffer, O_RDONLY);
+                if (timerlat_fd < 0) {
+                        printf("error opening %s: %s\n", buffer, strerror(errno));
+                        exit(1);
+                }
+
+                for (;;) {
+                        retval = read(timerlat_fd, buffer, 1024);
+                        if (retval < 0)
+                                break;
+                }
+
+                close(timerlat_fd);
+                exit(0);
+        }
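
The documented example omits its #include lines and does nothing between reads, so the return from user-space latency it reports covers only the syscall round trip. The following is a minimal sketch, not part of this commit, of how the tracer-side conditions might be met and how a workload could run between reads. The helper names (write_str, timerlat_setup, timerlat_fd_path, timerlat_run) are hypothetical. The tracefs mount point, the use of osnoise/options to set NO_OSNOISE_WORKLOAD, and the order of the two writes are assumptions. sched_setaffinity() is called with pid 0, which targets the calling thread and avoids depending on a gettid() wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumption: tracefs is mounted here; some systems use /sys/kernel/debug/tracing. */
#define TRACING_DIR "/sys/kernel/tracing"

/* Write a string to an existing control file. Returns 0 on success, -1 on error. */
int write_str(const char *path, const char *val)
{
	ssize_t len = (ssize_t)strlen(val);
	ssize_t ret;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	ret = write(fd, val, (size_t)len);
	close(fd);
	return ret == len ? 0 : -1;
}

/*
 * Hypothetical helper: set NO_OSNOISE_WORKLOAD and enable the timerlat
 * tracer. Needs root. File names and write order are assumptions.
 */
int timerlat_setup(void)
{
	if (write_str(TRACING_DIR "/osnoise/options", "NO_OSNOISE_WORKLOAD"))
		return -1;
	return write_str(TRACING_DIR "/current_tracer", "timerlat");
}

/* Build the per-CPU timerlat_fd path. Returns 0, or -1 if buf is too small. */
int timerlat_fd_path(char *buf, size_t size, long cpu)
{
	int n;

	assert(buf != NULL && size > 0);
	n = snprintf(buf, size,
		     TRACING_DIR "/osnoise/per_cpu/cpu%ld/timerlat_fd", cpu);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Pin the calling thread to cpu, open that CPU's timerlat_fd, and run up to
 * `cycles` activations, calling work() after each wakeup. Returns the number
 * of completed cycles, or -1 if pinning or opening failed.
 */
int timerlat_run(long cpu, int cycles, void (*work)(void))
{
	char buf[1024];
	cpu_set_t set;
	int fd, i;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set) == -1)
		return -1;

	if (timerlat_fd_path(buf, sizeof(buf), cpu))
		return -1;

	fd = open(buf, O_RDONLY);
	if (fd < 0)
		return -1;

	for (i = 0; i < cycles; i++) {
		/* Blocks until the timer fires and this thread is scheduled. */
		if (read(fd, buf, sizeof(buf)) < 0)
			break;
		if (work)
			work();	/* counted in the latency reported on the next read() */
	}

	close(fd);
	return i;
}
```

A caller running as root might invoke timerlat_setup() once and then timerlat_run(0, 1000, do_work) from the thread to be measured, where do_work is the caller's own function. Since the return from user-space latency is reported when the file is read again, the work done after the final read() in this sketch is not reported.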
