Commit a1cba8e

AKASHI Takahiro authored and jwessel committed
arm64: kgdb: fix single stepping
Jason,

Could you please review my patch below? See also the arm64 maintainer's
comment:
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-January/313712.html

Thanks,
-Takahiro AKASHI

I tried to verify kgdb in the vanilla kernel on the fast model, but it
seems that single stepping with kgdb has not worked correctly since its
first appearance in v3.15.

On v3.15, a 'stepi' command after breaking the kernel at some breakpoint
steps forward to the next instruction, but the succeeding 'stepi' never
goes beyond that.
On v3.16, 'stepi' moves forward and stops at the next instruction just
after enable_dbg in el1_dbg, and never goes beyond that. This variance of
behavior seems to come in with the following patch in v3.16:

    commit 2a28307 ("arm64: debug: avoid accessing mdscr_el1 on fault
    paths where possible")

This patch
(1) moves kgdb_disable_single_step() from 'c' command handling to the
    single step handler.
    This makes sure that single stepping takes effect at every 's'
    command. Please note that, under the current implementation, the
    single step bit in spsr, which is cleared by the first single step,
    will not be set again for consecutive 's' commands because the single
    step bit in mdscr is still kept on (that is,
    kernel_active_single_step() in kgdb_arch_handle_exception() is true).
(2) re-implements kgdb_roundup_cpus() because the current implementation
    enabled interrupts naively. See below.
(3) removes 'enable_dbg' in el1_dbg.
    The single step bit in mdscr is turned on in
    do_handle_exception() -> kgdb_handle_exception() before returning to
    the debugged context, and if the debug exception is enabled in
    el1_dbg, we will see unexpected single-stepping in el1_dbg.
    Since v3.18, the following patch does the same:

      commit 1059c6b ("arm64: debug: don't re-enable debug exceptions on
      return from el1_dbg")

(4) masks interrupts while single-stepping one instruction.
    If an interrupt is caught while a single step is being processed, the
    debug exception is unintentionally enabled by el1_irq's 'enable_dbg'
    before returning to the debugged context.
    Thus, like in (3), we will see unexpected single-stepping in el1_irq.

Basically (1) and (2) are for v3.15, (3) and (4) for v3.1[67].

* issue fixed by (2):
Without (2), we would see another problem if a breakpoint is set at
interrupt-sensitive places, like gic_handle_irq():

    KGDB: re-enter error: breakpoint removed ffffffc000081258
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 650 at kernel/debug/debug_core.c:435
    					kgdb_handle_exception+0x1dc/0x1f4()
    Modules linked in:
    CPU: 0 PID: 650 Comm: sh Not tainted 3.17.0-rc2+ #177
    Call trace:
    [<ffffffc000087fac>] dump_backtrace+0x0/0x130
    [<ffffffc0000880ec>] show_stack+0x10/0x1c
    [<ffffffc0004d683c>] dump_stack+0x74/0xb8
    [<ffffffc0000ab824>] warn_slowpath_common+0x8c/0xb4
    [<ffffffc0000ab90c>] warn_slowpath_null+0x14/0x20
    [<ffffffc000121bfc>] kgdb_handle_exception+0x1d8/0x1f4
    [<ffffffc000092ffc>] kgdb_brk_fn+0x18/0x28
    [<ffffffc0000821c8>] brk_handler+0x9c/0xe8
    [<ffffffc0000811e8>] do_debug_exception+0x3c/0xac
    Exception stack(0xffffffc07e027650 to 0xffffffc07e027770)
    ...
    [<ffffffc000083cac>] el1_dbg+0x14/0x68
    [<ffffffc00012178c>] kgdb_cpu_enter+0x464/0x5c0
    [<ffffffc000121bb4>] kgdb_handle_exception+0x190/0x1f4
    [<ffffffc000092ffc>] kgdb_brk_fn+0x18/0x28
    [<ffffffc0000821c8>] brk_handler+0x9c/0xe8
    [<ffffffc0000811e8>] do_debug_exception+0x3c/0xac
    Exception stack(0xffffffc07e027ac0 to 0xffffffc07e027be0)
    ...
    [<ffffffc000083cac>] el1_dbg+0x14/0x68
    [<ffffffc00032e4b4>] __handle_sysrq+0x11c/0x190
    [<ffffffc00032e93c>] write_sysrq_trigger+0x4c/0x60
    [<ffffffc0001e7d58>] proc_reg_write+0x54/0x84
    [<ffffffc000192fa4>] vfs_write+0x98/0x1c8
    [<ffffffc0001939b0>] SyS_write+0x40/0xa0

Once some interrupt occurs, a breakpoint at gic_handle_irq() triggers
kgdb. Kgdb then calls kgdb_roundup_cpus() to sync with the other cpus.
The current kgdb_roundup_cpus() unmasks interrupts temporarily to use
smp_call_function().
This eventually allows another interrupt to occur and likely results in
hitting the breakpoint at gic_handle_irq() again, since the debug
exception is always enabled in el1_irq.

We can avoid this issue by specifying "nokgdbroundup" as a kernel
parameter, but this will also leave the other cpus in an unknown state in
terms of kgdb, and may result in interference with kgdb activity.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
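The re-arming problem described in (1) can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: the
struct, the function names, and the simplification that the debugged
context advances only when both bits are set are illustrative and are not
kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two bits named in the commit message. */
struct step_state {
	bool mdscr_ss;	/* single step bit in mdscr */
	bool spsr_ss;	/* single step bit in the saved spsr */
};

/* 's' command: stepping is armed only if it is not already active. */
static void handle_step_cmd(struct step_state *s)
{
	if (!s->mdscr_ss) {
		s->mdscr_ss = true;
		s->spsr_ss = true;
	}
}

/* Resume the debugged context; true if one instruction was stepped. */
static bool resume(struct step_state *s)
{
	bool advanced = s->mdscr_ss && s->spsr_ss;

	s->spsr_ss = false;	/* the step consumes the spsr bit */
	return advanced;
}

/* Step handler: with the patch, stepping is disabled on every step. */
static void step_handler(struct step_state *s, bool patched)
{
	if (patched)
		s->mdscr_ss = false;
}

/* Count how many of 'cmds' consecutive 's' commands really advance. */
static int steps_taken(bool patched, int cmds)
{
	struct step_state s = { false, false };
	int taken = 0;

	assert(cmds >= 0);
	for (int i = 0; i < cmds; i++) {
		handle_step_cmd(&s);
		if (resume(&s))
			taken++;
		step_handler(&s, patched);
	}
	return taken;
}
```

In this model, two consecutive 's' commands advance only once without the
handler change and twice with it, which matches the v3.15 symptom
described above.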
1 parent 7a6653f commit a1cba8e

1 file changed

+46
-14
lines changed

arch/arm64/kernel/kgdb.c

@@ -19,10 +19,14 @@
  * along with this program. If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/cpumask.h>
 #include <linux/irq.h>
+#include <linux/irq_work.h>
 #include <linux/kdebug.h>
 #include <linux/kgdb.h>
 #include <linux/kprobes.h>
+#include <linux/percpu.h>
+#include <asm/ptrace.h>
 #include <asm/traps.h>
 
 struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
@@ -106,6 +110,9 @@ struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
 	{ "fpcr", 4, -1 },
 };
 
+static DEFINE_PER_CPU(unsigned int, kgdb_pstate);
+static DEFINE_PER_CPU(struct irq_work, kgdb_irq_work);
+
 char *dbg_get_reg(int regno, void *mem, struct pt_regs *regs)
 {
 	if (regno >= DBG_MAX_REG_NUM || regno < 0)
@@ -189,18 +196,14 @@ int kgdb_arch_handle_exception(int exception_vector, int signo,
 		 * over and over again.
 		 */
 		kgdb_arch_update_addr(linux_regs, remcom_in_buffer);
-		atomic_set(&kgdb_cpu_doing_single_step, -1);
-		kgdb_single_step = 0;
-
-		/*
-		 * Received continue command, disable single step
-		 */
-		if (kernel_active_single_step())
-			kernel_disable_single_step();
 
 		err = 0;
 		break;
 	case 's':
+		/* mask interrupts while single stepping */
+		__this_cpu_write(kgdb_pstate, linux_regs->pstate);
+		linux_regs->pstate |= PSR_I_BIT;
+
 		/*
 		 * Update step address value with address passed
 		 * with step packet.
@@ -211,8 +214,6 @@ int kgdb_arch_handle_exception(int exception_vector, int signo,
 		 */
 		kgdb_arch_update_addr(linux_regs, remcom_in_buffer);
 		atomic_set(&kgdb_cpu_doing_single_step, raw_smp_processor_id());
-		kgdb_single_step = 1;
-
 		/*
 		 * Enable single step handling
 		 */
@@ -244,6 +245,18 @@ NOKPROBE_SYMBOL(kgdb_compiled_brk_fn);
 
 static int kgdb_step_brk_fn(struct pt_regs *regs, unsigned int esr)
 {
+	unsigned int pstate;
+
+	kernel_disable_single_step();
+	atomic_set(&kgdb_cpu_doing_single_step, -1);
+
+	/* restore interrupt mask status */
+	pstate = __this_cpu_read(kgdb_pstate);
+	if (pstate & PSR_I_BIT)
+		regs->pstate |= PSR_I_BIT;
+	else
+		regs->pstate &= ~PSR_I_BIT;
+
 	kgdb_handle_exception(1, SIGTRAP, 0, regs);
 	return 0;
 }
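Note on the hunk above: the save in the 's' case and the restore in the
step handler can be modelled in user space as follows. This is a minimal
sketch, assuming a single CPU; saved_pstate, step_begin(), step_end() and
demo_roundtrip() are hypothetical names standing in for the per-CPU
kgdb_pstate variable and the two code paths, and a 64-bit type is used
for the saved value purely for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* IRQ mask bit (bit 7) in the AArch64 PSTATE/SPSR layout. */
#define PSR_I_BIT 0x00000080u

/* Hypothetical stand-in for the per-CPU kgdb_pstate variable. */
static uint64_t saved_pstate;

/* 's' command path: remember the original pstate, then mask IRQs. */
static uint64_t step_begin(uint64_t pstate)
{
	saved_pstate = pstate;
	return pstate | PSR_I_BIT;
}

/* Step handler path: restore only the I bit, keep all other bits. */
static uint64_t step_end(uint64_t pstate)
{
	if (saved_pstate & PSR_I_BIT)
		return pstate | PSR_I_BIT;
	return pstate & ~(uint64_t)PSR_I_BIT;
}

/* Usage sketch: IRQs unmasked before the step are unmasked after it. */
static void demo_roundtrip(void)
{
	uint64_t during = step_begin(0x5);	/* I bit clear on entry */

	assert(during & PSR_I_BIT);		/* masked while stepping */
	assert(step_end(during) == 0x5);	/* original state restored */
}
```

Restoring only the I bit, rather than writing the whole saved value back,
keeps any other pstate changes made by the stepped instruction.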
@@ -265,16 +278,27 @@ static struct step_hook kgdb_step_hook = {
 	.fn = kgdb_step_brk_fn
 };
 
-static void kgdb_call_nmi_hook(void *ignored)
+static void kgdb_roundup_hook(struct irq_work *work)
 {
 	kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs());
 }
 
 void kgdb_roundup_cpus(unsigned long flags)
 {
-	local_irq_enable();
-	smp_call_function(kgdb_call_nmi_hook, NULL, 0);
-	local_irq_disable();
+	int cpu;
+	struct cpumask mask;
+	struct irq_work *work;
+
+	mask = *cpu_online_mask;
+	cpumask_clear_cpu(smp_processor_id(), &mask);
+	cpu = cpumask_first(&mask);
+	if (cpu >= nr_cpu_ids)
+		return;
+
+	for_each_cpu(cpu, &mask) {
+		work = per_cpu_ptr(&kgdb_irq_work, cpu);
+		irq_work_queue_on(work, cpu);
+	}
 }
 
 static int __kgdb_notify(struct die_args *args, unsigned long cmd)
@@ -315,13 +339,21 @@ static struct notifier_block kgdb_notifier = {
 int kgdb_arch_init(void)
 {
 	int ret = register_die_notifier(&kgdb_notifier);
+	int cpu;
+	struct irq_work *work;
 
 	if (ret != 0)
 		return ret;
 
 	register_break_hook(&kgdb_brkpt_hook);
 	register_break_hook(&kgdb_compiled_brkpt_hook);
 	register_step_hook(&kgdb_step_hook);
+
+	for_each_possible_cpu(cpu) {
+		work = per_cpu_ptr(&kgdb_irq_work, cpu);
+		init_irq_work(work, kgdb_roundup_hook);
+	}
+
 	return 0;
 }

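The target selection in the new kgdb_roundup_cpus() (all online cpus
except the caller, with an early return when none remain) can be sketched
with a plain bitmask. This is an illustrative user-space model, not the
kernel cpumask API: roundup_targets(), count_targets() and the 64-cpu
limit are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 64	/* assumed limit so the mask fits in one word */

/* Online cpus minus the calling cpu, as a bitmask (bit n = cpu n). */
static uint64_t roundup_targets(uint64_t online, unsigned int self)
{
	assert(self < MAX_CPUS);
	return online & ~(UINT64_C(1) << self);
}

/* Number of cpus that would have irq work queued on them. */
static unsigned int count_targets(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)	/* clear the lowest set bit */
		n++;
	return n;
}
```

An empty result corresponds to the early return in the patch: on a system
with a single online cpu there is nothing to round up.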