Category Archives: Linux

Use sudo without password

When you experiment on your own computer or in a lab, security is not a big concern, yet every sudo command still asks for a password. Worse, a script that runs sudo commands on a remote computer cannot run unattended, because the user is prompted to type the password, which is quite inconvenient. So let’s make your life easier.

Do the following and you can run sudo commands without a password:

sudo visudo

Insert this line at the end of the file (when several entries match, sudo uses the last one):

ALL ALL = (ALL) NOPASSWD: ALL

That’s it!

 


Understanding the cpu id in multi-core system with hyperthreading

When you experiment on a multi-core system with Hyperthreading enabled, you may want to set the cpu affinity to try different configurations. The first thing you need to know is the meaning of the cpu id (here the cpu id is actually the id of a logical core), because you need to know which logical cores are in the same cpu and which logical cores share the same physical core. All of this information is in /proc/cpuinfo, but only three fields are relevant to understanding the cpu id: processor #, core id and physical id.

The processor # is the cpu id; each logical core has a unique processor #. The core id is the physical core id; each physical core has a unique core id within its socket. The physical id refers to the socket id; each socket has a unique physical id. Logical cores that have the same physical id are in the same cpu, and logical cores that have both the same physical id and the same core id share the same physical core.

The command to retrieve these data is: egrep "(( id|processo).*:|^ *$)" /proc/cpuinfo. The following is sample output:

processor: 0
physical id: 0
core id: 0

processor: 1
physical id: 1
core id: 0

processor: 2
physical id: 0
core id: 1

Based on this information, you can understand the cpu id numbering. I drew a figure to show the relationship of cpu ids in my system, which uses an Intel(R) Xeon(R) CPU E5-2665: two sockets (cpus), each with 8 cores, and each core exposes 2 logical cores.

[Figure: cpuid]

 

BTW, when you want to check each logical core’s performance, such as cpu usage, you can use the top command: type top, then press 1. Here is an example:

[Screenshot: top]

Reference:
http://stackoverflow.com/questions/3019129/cpu-ordering-in-linux-with-hyper-threading
http://www.richweb.com/cpu_info

Hijack the divide_error (Interrupt 0) exception handler

My technique for hijacking the divide_error exception (interrupt 0) follows these steps:

1. Read the original IDT pointer from the IDTR register to get the address and size of the original IDT.
2. Create a new IDT by allocating one page and copying the original IDT into it.
3. In the new IDT, modify the divide_error entry so that it points to my assembly handler.
4. My assembly handler calls my C handler first and then simply jumps to the original assembly handler.
5. The address of the original assembly handler is obtained from System.map and hardcoded in the source code.
6. Create a new IDT pointer based on the new IDT.
7. Activate the new IDT by loading the new IDT pointer into the IDTR register.
8. Recover by loading the original IDT pointer into the IDTR register.

All these steps are implemented in my module, called hook. Loading the module hijacks the IDT; removing it restores the original IDT and reports the total number of divide error interrupts handled during the hijack. My C handler simply maintains a counter and prints it each time it is invoked. I wrote a C program called float to generate the divide error interrupt. It looks like:

int a,b;
a = 1;
b = 0;
printf("%d\n",a/b);

All source code can be found in the Appendix at the end. The following snapshot demonstrates my work, which was tested on Linux kernel 3.12.6 (32-bit x86).

[Screenshot: hijacking]

Appendix:

File: hook.c

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/tty.h>
#include <linux/sched.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <asm/desc.h>

#define DIVIDE_ERROR 0x00

/* global variables */
char msg[200];
char *str = "test";
struct desc_ptr newidtr, oldidtr,tempidtr;
gate_desc *newidt, *oldidt, *tempidt;
int counter = 0;
unsigned long old_stub = 0xc15d9c64;
struct tty_struct *my_tty;

/* global function */
extern asmlinkage void new_stub(void);

/* print message to console */
int write_console (char *str)
{
         struct tty_struct *my_tty;
         if((my_tty=current->signal->tty) != NULL)
         {
                ((my_tty->driver->ops->write) (my_tty,str,strlen(str)));
                return 0;
         }
         else return -1;
}

/* activate idt_table by loading new idt pointer to the register */
static void load_IDTR(void *addr)
{
	asm volatile("lidt %0"::"m"(*(unsigned short *)addr));
}

/* my C handler */
void my_func(void)
{
	/* increment the counter and send a message to the console */
	sprintf(msg, "Counter = %d \r\n", ++counter);
	((my_tty->driver->ops->write)(my_tty,msg,strlen(msg)));
}

/* my Assembly handler */
void my_dummy(void)
{
        __asm__ (
        ".globl new_stub    \n\t"
        ".align 4, 0x90     \n\t"
        "new_stub:	    \n\t"
        "pushfl	            \n\t"
        "pushal	            \n\t"
        "call my_func 	    \n\t"
        "popal	            \n\t"
        "popfl	            \n\t"
        "jmp *old_stub      \n\t"
         ::);
}
 
int __init hook_init(void){

	/* message */
	write_console("Jianchen hijacked interrupt_0\r\n");

	/* initialize tty for console print */
	my_tty = current->signal->tty;

	/* create new idt_table copied from old one */
	store_idt(&oldidtr);
	oldidt = (gate_desc *)oldidtr.address;
	newidtr.address = __get_free_page(GFP_KERNEL);
	if(!newidtr.address)
		return -ENOMEM;
	newidtr.size = oldidtr.size;
	newidt = (gate_desc *)newidtr.address; 
	/* desc_ptr.size holds the IDT limit (bytes - 1), so copy size + 1 bytes */
	memcpy(newidt, oldidt, oldidtr.size + 1);

	/* modify the divide_error entry to point to my assembly handler */ 
	pack_gate(&newidt[DIVIDE_ERROR], GATE_INTERRUPT, (unsigned long)new_stub, 0, 0, __KERNEL_CS);

	/* activate the new idt_table */
	load_IDTR((void *)&newidtr);

	/* for smp architecture */
     //smp_call_function(load_IDTR,(void *)&newidtr, 0);

   	return 0; 
} 
void __exit hook_exit(void){

	/* message */
	write_console("Jianchen recovered interrupt_0 \r\n");
	sprintf(msg, "Interrupt_0 handled during hijacking = %d \r\n", counter);
	write_console(msg);
	
	/* activate old idt_table */
	load_IDTR(&oldidtr);

	/* for smp architecture */
     //smp_call_function(load_IDTR, (void *)&oldidtr, 0);

	/* free the allocated page for new idt_table */
	if(newidtr.address)
		free_page(newidtr.address);	
}
 
module_init(hook_init);
module_exit(hook_exit);
MODULE_LICENSE("GPL");

File: Makefile

obj-m += hook.o

all: 
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

File: float.c

#include <stdio.h>

int main()
{
   	int a,b;
   	a = 1;
   	b = 0;
   	printf("%d\n",a/b);
   	return 0; 
}

references:
http://mammon.github.io/Text/linux_hooker.txt
https://ruinedsec.wordpress.com/2013/04/04/modifying-system-calls-dispatching-linux/
http://phrack.org/issues/59/4.html
http://stackoverflow.com/questions/2497919/changing-the-interrupt-descriptor-table
http://www.makelinux.net/books/lkd2/ch11lev1sec3
http://www.jamesmolloy.co.uk/tutorial_html/4.-The%20GDT%20and%20IDT.html
http://stackoverflow.com/questions/5302392/idt-table-undefined-warning-when-compiling-kernel-module

linux !g command

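!g re-runs the most recent command in the shell history that starts with g, for example the last gcc/g++ compilation command.
Example:
#1: gcc test.c -o test
#2: ls
#3: !g
In the example, !g runs #1 again, which saves retyping. Note that any later command starting with g (such as grep or git) would be matched instead; use a longer prefix such as !gcc to be specific.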

linux fg command

When you run a job in the background, you can bring it to the foreground with the fg command.
Example:
—————–
./test &
fg 1
—————–
The job “test” is now in the foreground.
fg [%job_id]: job_id specifies the job that you want to bring to the foreground. It is the job number that the shell prints when the job starts and that the jobs command lists; here 1 is the number of the first background job. Without an argument, fg picks the current (most recent) job.
This command is useful when you don’t want to open a new terminal to issue another command: put the running job in the background, issue the other command in the same window, and then pull the job back.

My own function to print slabinfo of kmalloc_caches in the format of /proc/slabinfo

Routine prototype: void my_get_kmalloc_cache_slabinfo(void)
Definition location: source/mm/slub.c
Declaration location: source/include/linux/slab.h
Call location:
kernel_init()
{
……
flush_delayed_fput();
my_get_kmalloc_cache_slabinfo();
……
}
Output:
[Screenshot of the output]
Source code:

void my_get_kmalloc_cache_slabinfo(void)
{
    int i, j;
    struct kmem_cache *s;
    struct kmem_cache_node *n;
    struct page *pos;
    
    //statistics parameters
    unsigned long nr_active_objs, nr_objs, obj_size, objs_per_slab, pages_per_slab, nr_active_slabs, nr_slabs, nr_free;
    
    //KMALLOC_SHIFT_HIGH is 13 when using CONFIG_SLUB
    for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
        //initialize
        nr_active_objs    = 0;
        nr_objs           = 0;
        obj_size          = 0;
        objs_per_slab     = 0;
        pages_per_slab    = 0;
        nr_active_slabs   = 0;
        nr_slabs          = 0;
        nr_free           = 0;
        
        s = kmalloc_caches[i];
        if (!s) {
            continue;
        }
        for (j = 0; j < MAX_NUMNODES; j++) {
            n = s->node[j];
            if (!n) {
                continue;
            }
            nr_slabs += (long)n->nr_slabs.counter;
            nr_objs += (long)n->total_objects.counter;
            //iterate node->partial to get struct page and page.nr_free
            list_for_each_entry(pos, &n->partial, lru)
            {
                nr_free += pos->objects - pos->inuse;
            }
        }
        nr_active_objs  = nr_objs - nr_free;
        obj_size        = s->object_size;//without metadata
        objs_per_slab   = nr_slabs ? nr_objs / nr_slabs : 0; //avoid dividing by zero for an empty cache
        pages_per_slab  = 1 + obj_size * objs_per_slab / (1<<12); //pagesize is 4KB
        nr_active_slabs = nr_slabs;
        
        //adjacent string literals are concatenated; a literal cannot span lines
        printk("%s -> %lu %lu %lu %lu %lu "
               ":tunables %d %d %d "
               ":slabdata %lu %lu %d\n",
               s->name, nr_active_objs, nr_objs, obj_size,
               objs_per_slab, pages_per_slab, 0, 0, 0, 
               nr_active_slabs, nr_slabs, 0);
        //parameters with value 0 don't apply to the slub allocator
        
    }
}

Implement function to print buddyinfo

The following function prints the buddyinfo, which is also available from /proc/buddyinfo

int my_buddyinfo_show(void)
{
    int i;
    int j;
    struct pglist_data* node_0;
    struct zone* pzone;
    node_0 = NODE_DATA(0);
    if (node_0 == NULL)
        return 0;
    for (i = 0; i < MAX_NR_ZONES; ++i)
    {
        pzone = &node_0->node_zones[i];
        /* node_zones is an array, so test for an empty zone, not for NULL */
        if (!populated_zone(pzone))
            continue;
        printk("Node 0 Zone %s", pzone->name);
        /* pr_cont keeps the counts on the same log line as the zone name */
        for (j = 0; j < MAX_ORDER; ++j)
        {
            pr_cont("%5lu", pzone->free_area[j].nr_free);
        }
        pr_cont("\n");
    }
    return 0;
}

Use virsh to access kvm guest’s console

When you create a bunch of virtual machines in kvm, you will find that managing them with a GUI tool wastes time. You can use virsh to access a guest’s text console efficiently.

Step 1: check whether console device has been defined

virsh ttyconsole my_vm

If a device is shown (e.g. /dev/pts/41), the guest already has a console device. Otherwise, define one with virsh edit. Here is an example to be added inside <devices>.

<console type='pty'>
  <target port='0'/>
</console>

Step 2: configure a serial console in the guest so that it will accept a connection

sudo vi /etc/init/ttyS0.conf

Add the configuration:

————————————————————————
# ttyS0 - getty
#
# This service maintains a getty on ttyS0 from the point the system is
# started until it is shut down again.

start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]

respawn
exec /sbin/getty -L 115200 ttyS0 xterm
————————————————————————
sudo start ttyS0

Here “xterm” is the guest’s terminal type; you can find it out in the guest by running

echo $TERM

My Ubuntu 12.04 LTS terminal type is “linux”, so I can replace “xterm” with “linux”.

Step 3: now you can enjoy the convenience of

virsh console my_vm

Link: Ubuntu Official Document

Quickly create large file in linux

In linux we often need to quickly create a large file whose content we don’t care about, for tests such as disk I/O tests. For that we can use:

fallocate -l 10GB test.bin

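This command quickly creates a 10GB file named test.bin (to fallocate the GB suffix means 10 * 1000^3 bytes; use 10G or 10GiB for powers of 1024). It is fast because it only preallocates the space without writing anything into it, so the heavy disk I/O is avoided. The content is not random: the preallocated range reads back as zeros.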

Linux boot sequence

Linux Boot Sequence:

——————————————————–

Power on the machine to initialize the firmware
Pick one CPU as the bootstrap processor
The system is now in real mode; only 1MB of memory can be addressed
Use EIP and the hidden base address to execute the first instruction at the reset vector
Jump to the BIOS entry point in flash memory, routed by the chipset’s memory map

——————————————————–

Now the CPU runs BIOS code
Invoke POST to test system components
Search for a bootable device using the boot order stored in CMOS
Load the MBR, the first sector of that device, to memory address 0x7c00
The MBR contains the primary boot loader and the partition table

——————————————————–

Now the CPU jumps to 0x7c00 and runs the MBR code
First stage: search for the active partition and load its boot sector
This boot sector understands the Linux file system format
Use the boot loader code in the MBR plus the boot sector to load another boot loader
Second stage: run the newly loaded boot loader
It reads a boot configuration file, e.g. grub.conf, to show boot choices to the user
All the above boot loaders combined are called GRUB
Finally GRUB loads the selected system’s kernel image into memory
The image is split into two pieces:
a small one that runs in real mode and a large compressed one that runs in protected mode
GRUB can pass parameters to the kernel header, then jumps to the kernel entry point in real mode

——————————————————–

The boot loader has finished and the kernel stage starts
The function flow:
start_of_setup(): basic hardware setup (arch/x86/boot/header.S)
go_to_protected_mode(): set CPU to protected mode (arch/x86/boot/pm.c, called from main() in arch/x86/boot/main.c)
startup_32(): basic register setting (arch/x86/boot/compressed/head_32.S)
decompress_kernel(): decompress image (arch/x86/boot/compressed/misc.c)
Jump to the kernel entry point in protected mode
startup_32(): also called process 0, creates the IDT and GDT, enables paging and initializes the page table etc. (arch/x86/kernel/head_32.S)
start_kernel(): architecture-independent kernel start-up (init/main.c)

——————————————————–
start_kernel() does a long list of initialization and then calls rest_init()
rest_init(): (init/main.c)
-> kernel_thread(kernel_init): activate the remaining CPUs and create the first user-space process (process 1), which calls init_post() and runs /sbin/init etc. (init/main.c)
-> kernel_thread(kthreadd): create the kernel thread daemon, process 2 (init/main.c)
-> schedule(): the context switch kicks in and process 1 starts other processes by checking its configuration file (init/main.c)
-> cpu_idle(): process 0 becomes the idle task and is switched out whenever there is work to do (init/main.c)

resource:
http://duartes.org/gustavo/blog/post/kernel-boot-process
http://www.ibm.com/developerworks/linux/library/l-linuxboot/
http://duartes.org/gustavo/blog/post/how-computers-boot-up