lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sun, 22 Jul 2007 21:32:02 -0400 From: Kyle Moffett <mrlinuxman@....com> To: Lars Ellenberg <lars@...bit.com> Cc: Jens Axboe <axboe@...nel.dk>, Andrew Morton <akpm@...l.org>, lkml <linux-kernel@...r.kernel.org>, drbd-user@...bit.com Subject: Re: [DRIVER SUBMISSION] DRBD wants to go mainline Ok, I didn't have a chance to get through anywhere near all of it, but here's my comments so far. I didn't really go through things in any particular order but most of these comments are about your drbd_int.h header file. Hopefully a lot of the comments will be useful to fix similar/identical problems in your C files. First of all, if you could break this up into chunks (even if they aren't useful individually) just to make it easier to review. Divisions like "This code does on-disk bitmap management", and "This code does network protocol encoding/decoding" would be extremely helpful when digging through this stuff. I ended up only really doing a cursory review for low-level/style issues, since without the bigger picture (and without the fixed style issues and macro cleanups) it's very hard to really give a good high-level review. Cheers, Kyle Moffett +drbd-objs := drbd_buildtag.o drbd_bitmap.o drbd_proc.o \ + drbd_worker.o drbd_receiver.o drbd_req.o drbd_actlog.o \ + lru_cache.o drbd_main.o drbd_strings.o drbd_nl.o + +obj-$(CONFIG_BLK_DEV_DRBD) += drbd.o Don't use foo-objs, use foo-y instead. +#undef DEVICE_NAME +#define DEVICE_NAME "drbd" This is never actually defined/redefined anywhere else. Just hardcode the "drbd" in your printk strings and save yourself 5-6 characters per use. +/* I don't remember why XCPU ... + * This is used to wake the asender, + * and to interrupt sending the sending task + * on disconnect. + */ +#define DRBD_SIG SIGXCPU + +/* This is used to stop/restart our threads. + * Cannot use SIGTERM nor SIGKILL, since these + * are sent out by init on runlevel changes + * I choose SIGHUP for now. + * + * FIXME btw, we should register some reboot notifier. + */ +#define DRBD_SIGKILL SIGHUP Don't use signals between kernel threads, use proper primitives like notifiers and waitqueues, which means you should also probably switch away from kernel_thread() to the kthread_*() APIs. Also you should fix this FIXME or remove it if it no longer applies:-D. +#ifdef PARANOIA +# define PARANOIA_BUG_ON(x) BUG_ON(x) +#else +# define PARANOIA_BUG_ON(x) +#endif This is only ever used in one place for a simple != test: + PARANOIA_BUG_ON(w != &mdev->resync_work); Just delete PARANOIA_BUG_ON and convert the above to a straight BUG_ON() +#define STATIC +#define STATIC static These two lines are found in different files, but the symbol "STATIC" isn't used anywhere. Just get rid of it. +/* handy macro: DUMPP(somepointer) */ +#define DUMPP(A) ERR( #A " = %p in %s:%d\n", (A), __FILE__, __LINE__); [...] +/* Info: do not remove the spaces around the "," before ## + * Otherwise this is not portable from gcc-2.95 to gcc-3.3 */ +#define PRINTK(level, fmt, args...) \ + printk(level DEVICE_NAME "%d: " fmt, \ + mdev->minor , ##args) + +#define ALERT(fmt, args...) PRINTK(KERN_ALERT, fmt , ##args) [...] No more custom debugging macros please, we have plenty of standardized ones in the kernel already (and all togther too many nonstandardized ones). Please take a good look at dev_printk, etc to make some of your printk()s shorter, but don't add more icky macros. Also gcc < 3.1 is now unsupported, so please remove gcc-2.95 portability comments/cruft (although in this case the code itself doesn't need changing, just the comments). +/* see kernel/printk.c:printk_ratelimit + * macro, so it is easy do have independend rate limits at different locations + * "initializer element not constant ..." with kernel 2.4 :( + * so I initialize toks to something large + */ +#define DRBD_ratelimit(ratelimit_jiffies, ratelimit_burst) \ Any particular reason you can't just use printk_ratelimit for this? Also you should remove any linux 2.4-related code/comments as they won't apply for code submitted to 2.6 mainline. +#ifdef DBG_ASSERTS +extern void drbd_assert_breakpoint(struct drbd_conf *, char *, char *, int ); +# define D_ASSERT(exp) if (!(exp)) \ + drbd_assert_breakpoint(mdev, #exp, __FILE__, __LINE__) +#else +# define D_ASSERT(exp) if (!(exp)) \ + ERR("ASSERT( " #exp " ) in %s:%d\n", __FILE__, __LINE__) +#endif +#define ERR_IF(exp) if (({ \ + int _b = (exp) != 0; \ + if (_b) ERR("%s: (" #exp ") in %s:%d\n", \ + __func__, __FILE__, __LINE__); \ + _b; \ + })) Yuck, more debugging macros. +/* Defines to control fault insertion */ +enum { + DRBD_FAULT_MD_WR = 0, /* meta data write */ + DRBD_FAULT_MD_RD, /* read */ + DRBD_FAULT_RS_WR, /* resync */ + DRBD_FAULT_RS_RD, + DRBD_FAULT_DT_WR, /* data */ + DRBD_FAULT_DT_RD, + DRBD_FAULT_DT_RA, /* data read ahead */ + + DRBD_FAULT_MAX, +}; We have some existing failure-injection code, any chance you could rip out your custom logic and just plug that? I haven't looked over it, though, so I can't really offer any useful suggestions about it. +#include <linux/stringify.h> Put the include at the top of the file, please? +/* integer division, round _UP_ to the next integer */ +#define div_ceil(A, B) ( (A)/(B) + ((A)%(B) ? 1 : 0) ) +/* usual integer division */ +#define div_floor(A, B) ( (A)/(B) ) Your div_ceil function looks OK but I think we already have a standard one somewhere. You might remove that div_floor function, though, as it isn't used anywhere. +/* + * Compatibility Section + *************************/ + +#define LOCK_SIGMASK(task, flags) spin_lock_irqsave(&task->sighand->siglock, flags) +#define UNLOCK_SIGMASK(task, flags) spin_unlock_irqrestore(&task->sighand->siglock, flags) +#define RECALC_SIGPENDING() recalc_sigpending(); These aren't used anywhere, remove please? +#if defined(DBG_SPINLOCKS) && defined(__SMP__) +# define MUST_HOLD(lock) if (!spin_is_locked(lock)) { ERR("Not holding lock! in %s\n", __FUNCTION__ ); } +#else +# define MUST_HOLD(lock) +#endif Why not just open-code "WARN_ON_SMP(!spin_is_locked(lock))" in the few places this is used? Alternatively if you need this to actually BUG(), you might either add a BUG_ON_SMP() to include/asm-generic/bug.h or add a helper to include/linux/spinlock.h: #if defined(CONFIG_SMP) && defined(CONFIG_DEBUG_SPINLOCK) # define spin_lock_musthold(lock) BUG_ON(!spin_is_locked(lock)) #else # define spin_lock_musthold(lock) do { } while(0) #endif +#define SET_MDEV_MAGIC(x) \ + ({ typecheck(struct drbd_conf*, x); \ + (x)->magic = (long)(x) ^ DRBD_MAGIC; }) +#define IS_VALID_MDEV(x) \ + ( typecheck(struct drbd_conf*, x) && \ + ((x) ? (((x)->magic ^ DRBD_MAGIC) == (long)(x)):0)) SET_MDEV_MAGIC is only used in one place, can't you open-code it there? Even if it were used lots of places a little inline function would handle the "typecheck" for you and be easier to read. The IS_VALID_MDEV() isn't used anywhere, so delete it. +enum Drbd_thread_state { + None, + Running, + Exiting, + Restarting +}; + +struct Drbd_thread { + spinlock_t t_lock; + struct task_struct *task; + struct completion startstop; + enum Drbd_thread_state t_state; + int (*function) (struct Drbd_thread *); + struct drbd_conf *mdev; +}; As somebody already mentioned, you need to switch from kernel_thread to kthread_* helpers. +/* + * Having this as the first member of a struct provides sort of "inheritance". + * "derived" structs can be "drbd_queue_work()"ed. + * The callback should know and cast back to the descendant struct. + * drbd_request and Tl_epoch_entry are descendants of drbd_work. + */ +struct drbd_work; +typedef int (*drbd_work_cb)(struct drbd_conf *, struct drbd_work *, int cancel); +struct drbd_work { + struct list_head list; + drbd_work_cb cb; +}; + +struct drbd_barrier; +struct drbd_request { + struct drbd_work w; Eww, please don't do opencoded casting of these. You should use the proper container_of() macro instead: > struct foo { > int a; > }; > struct bar { > int b; > struct foo thefoo; > }; > void mycallback(struct foo *myfoo) > { > struct bar *mybar = container_of(myfoo, struct bar, thefoo); > process_a_bar(mybar); > } It's not *much* better typechecking, but it's at least a little. It also lets you put a struct drbd_work at a non-zero offset inside of some other struct, as container_of() does internal subtraction of the offsetof(). +/* THINK maybe we actually want to use the default "event/%s" worker threads + * or similar in linux 2.6, which uses per cpu data and threads. + * + * To be general, this might need a spin_lock member. + * For now, please use the mdev->req_lock to protect list_head, + * see drbd_queue_work below. + */ +struct drbd_work_queue { + struct list_head q; + struct semaphore s; /* producers up it, worker down()s it */ + spinlock_t q_lock; /* to protect the list. */ +}; Umm, how about fixing this to actually use proper workqueues or something instead of this open-coded mess? +/* for sync_conf and other types... */ +#define PACKET(name, number, fields) struct name { fields }; +#define INTEGER(pn, pr, member) int member; +#define INT64(pn, pr, member) __u64 member; +#define BIT(pn, pr, member) unsigned member : 1; +#define STRING(pn, pr, member, len) \ + unsigned char member[len]; int member ## _len; +#include "linux/drbd_nl.h" The only light at the end of this tunnel is the greedily burning fires of Macro Hell(TM). Isn't there any better way to do this than all those horrid macros? I can sorta see what drbd_nl.h is trying to do, but the way it's included multiple times with all those insane macros defined makes it really hard to mentally parse. +#ifdef PARANOIA + long magic; +#endif Any particular reason you can't just unconditionally use a magic number? It's little more than an extra word per device and a couple simple integer tests. +/* returns 1 if it was successfull, + * returns 0 if there was no data socket. + * so wherever you are going to use the data.socket, e.g. do + * if (!drbd_get_data_sock(mdev)) + * return 0; + * CODE(); + * drbd_put_data_sock(mdev); + */ +static inline int drbd_get_data_sock(struct drbd_conf *mdev) +{ + down(&mdev->data.mutex); + /* drbd_disconnect() could have called drbd_free_sock() + * while we were waiting in down()... */ + if (unlikely(mdev->data.socket == NULL)) { + up(&mdev->data.mutex); + return 0; + } + return 1; +} Any reason this can't be open-coded? The extra unlock logic should just be moved into the cleanup handlers for the appropriate function. It looks like you are using the semaphore as a simple binary mutex, so please switch this to "struct mutex" as well. +#if BITS_PER_LONG == 32 +#define LN2_BPL 5 +#define cpu_to_lel(A) cpu_to_le32(A) +#define lel_to_cpu(A) le32_to_cpu(A) +#elif BITS_PER_LONG == 64 +#define LN2_BPL 6 +#define cpu_to_lel(A) cpu_to_le64(A) +#define lel_to_cpu(A) le64_to_cpu(A) +#else +#error "LN2 of BITS_PER_LONG unknown!" +#endif Hrm, any reason you can't just make the bitmap an array of __u64 and always use the cpu_to_le64()/le64_to_cpu() functions? +/* I want the packet to fit within one page + * THINK maybe use a special bitmap header, + * including offset and compression scheme and whatnot + * Do not use PAGE_SIZE here! Use a architecture agnostic constant! + */ +#define BM_PACKET_WORDS ((4096-sizeof(struct Drbd_Header))/sizeof(long)) Yuck. Definitely use PAGE_SIZE here, so at least if it's broken on an arch with multiple page sizes, somebody can grep for PAGE_SIZE to fix it. It also means that on archs/configs with 8k or 64k pages you won't waste a bunch of memory. +/* Dynamic tracing framework */ +#ifdef ENABLE_DYNAMIC_TRACE + +extern int trace_type; +extern int trace_devs; [...] AGH!! Not another dynamic tracing framework! Please use one of the existing solutions like kprobes or something. Maybe define some static tracepoints if that would be useful? +static inline void drbd_tcp_cork(struct socket *sock) +{ +#if 1 + mm_segment_t oldfs = get_fs(); + int val = 1; + + set_fs(KERNEL_DS); + tcp_setsockopt(sock->sk, SOL_TCP, TCP_CORK, (char *)&val, sizeof(val) ); + set_fs(oldfs); +#else + tcp_sk(sock->sk)->nonagle |= TCP_NAGLE_CORK; +#endif +} Yuck, why'd you do it this way? Probably because your tcp_sk(sock->sk) stuff doesn't have proper locking, I'll bet. You can avoid all the extra wrapper crap by just looking in "do_tcp_setsockopt" and taking the appropriate lock: static inline void drbd_tcp_cork(struct socket *sock) { struct sock *sk = sock->sk; lock_sock(sk); tcp_sk(sk)->nonagle |= TCP_NAGLE_CORK; release-sock(sk); } On the other hand, none of the currently in-tree network block devices or network filesystems seem to feel the need to try to TCP_CORK their output, why does drbd? +#define peer_mask role_mask +#define pdsk_mask disk_mask +#define susp_mask 1 +#define user_isp_mask 1 +#define aftr_isp_mask 1 + +#define NS(T, S) \ + ({ union drbd_state_t mask; mask.i = 0; mask.T = T##_mask; mask; }), \ + ({ union drbd_state_t val; val.i = 0; val.T = (S); val; }) +#define NS2(T1, S1, T2, S2) \ + ({ union drbd_state_t mask; mask.i = 0; mask.T1 = T1##_mask; \ + mask.T2 = T2##_mask; mask; }), \ + ({ union drbd_state_t val; val.i = 0; val.T1 = (S1); \ + val.T2 = (S2); val; }) +#define NS3(T1, S1, T2, S2, T3, S3) \ + ({ union drbd_state_t mask; mask.i = 0; mask.T1 = T1##_mask; \ + mask.T2 = T2##_mask; mask.T3 = T3##_mask; mask; }), \ + ({ union drbd_state_t val; val.i = 0; val.T1 = (S1); \ + val.T2 = (S2); val.T3 = (S3); val; }) + +#define _NS(D, T, S) \ + D, ({ union drbd_state_t ns; ns.i = D->state.i; ns.T = (S); ns; }) +#define _NS2(D, T1, S1, T2, S2) \ + D, ({ union drbd_state_t ns; ns.i = D->state.i; ns.T1 = (S1); \ + ns.T2 = (S2); ns; }) +#define _NS3(D, T1, S1, T2, S2, T3, S3) \ + D, ({ union drbd_state_t ns; ns.i = D->state.i; ns.T1 = (S1); \ + ns.T2 = (S2); ns.T3 = (S3); ns; }) Grumble. When I earlier said I thought I was in macro hell, well, I was wrong. *THIS* is macro hell. What the fsck is that supposed to do? And it doesn't even include a *SINGLE* comment!!! +static inline void drbd_state_lock(struct drbd_conf *mdev) +{ + wait_event(mdev->misc_wait, + !test_and_set_bit(CLUSTER_ST_CHANGE, &mdev->flags)); +} Umm, what's wrong with a mutex here? +static inline int semaphore_is_locked(struct semaphore *s) +{ + if (!down_trylock(s)) { + up(s); + return 0; + } + return 1; +} This would be better implemented in the mutex/semaphore generic code, instead of stuffed in a network blockdev. On the other hand, since the only user is this: + D_ASSERT(semaphore_is_locked(&mdev->md_io_mutex)); Can't you just either get rid of the check or open-code it? It would also give you a chance to substitute D_ASSERT(foo) for a proper BUG_ON(!foo) :-D +static inline void +_drbd_queue_work(struct drbd_work_queue *q, struct drbd_work *w) [...] Yeah, this stuff should really be fixed to use generic workqueues or something. +#define ERR_IF_CNT_IS_NEGATIVE(which) \ + if (atomic_read(&mdev->which) < 0) \ + ERR("in %s:%d: " #which " = %d < 0 !\n", \ + __func__ , __LINE__ , \ + atomic_read(&mdev->which)) Another macro with an implicit parameter (mdev). Please fix all of these +#define dec_unacked(mdev) do { \ + typecheck(struct drbd_conf *, mdev); \ + atomic_dec(&mdev->unacked_cnt); \ + ERR_IF_CNT_IS_NEGATIVE(unacked_cnt); } while (0) [...] More macros that should be inline functions. + if (!have_net_conf) dec_net(mdev); There's lots of coding-style violations like these in there. You should put the "dec_net(mdev);" indented on the line *AFTER* the if statememt. Please read linux/Documentation/CodingStyle for full details. + int mxb = 1000000; /* arbitrary limit on open requests */ Arbitrary limits are usually bad. Derive it from system properties, make it user-configurable, etc. +/* I'd like to use wait_event_lock_irq, + * but I'm not sure when it got introduced, + * and not sure when it has 3 or 4 arguments */ +static inline void inc_ap_bio(struct drbd_conf *mdev) [...] + spin_lock_irq(&mdev->req_lock); + while (!__inc_ap_bio_cond(mdev)) { + prepare_to_wait(&mdev->misc_wait, &wait, TASK_UNINTERRUPTIBLE); + spin_unlock_irq(&mdev->req_lock); + schedule(); + finish_wait(&mdev->misc_wait, &wait); + spin_lock_irq(&mdev->req_lock); + } + spin_unlock_irq(&mdev->req_lock); Since you're submitting for upstream inclusion you can just delete the extra backwards-compatible opencoded stuff and use the generic helper. +#define ARRY_SIZE(A) (sizeof(A)/sizeof(A[0])) There's a generic ARRAY_SIZE(foo) somewhere in the kernel sources. Use that one please. +/* this is developers aid only! */ +#define PARANOIA_ENTRY() BUG_ON(test_and_set_bit(__LC_PARANOIA, &lc->flags)) +#define PARANOIA_LEAVE() do { clear_bit(__LC_PARANOIA, &lc->flags); smp_mb__after_clear_bit(); } while (0) +#define RETURN(x...) do { PARANOIA_LEAVE(); return x ; } while (0) What is the PARANOIA flag intended to verify? This needs better commenting or removal. Don't use macros which take implicit arguments (IE: the lc value is used in the macro without it being passed as a parameter). Since it's debugging code you could also replace the open-coded bit trylock with a blocking spinlock. If something hangs, just turn on lockdep and it will report *MUCH* more useful information about the deadlock. Macros which do a "return" are discouraged, but since this one is called RETURN() it might be OK. On the other hand, you should probably rename it as PARANOIA_RETURN(). +int _drbd_md_sync_page_io(struct drbd_conf *mdev, + struct drbd_backing_dev *bdev, + struct page *page, sector_t sector, + int rw, int size) +{ [...] + ok = (bio_add_page(bio, page, size, 0) == size); + if (!ok) goto out; Ewwww, can somebody say "CodingStyle"? Don't put an if() and its result on a single line, and just use a straight if instead: > if (bio_add_page(bio, page, size, 0) != size) > goto out; + mask = ( hardsect / MD_HARDSECT ) - 1; + D_ASSERT( mask == 1 || mask == 3 || mask == 7 ); + D_ASSERT( hardsect == (mask+1) * MD_HARDSECT ); Problematic spaces and parentheses. Again, see Documentation/CodingStyle for the preferred forms. On the other hand, sometimes little code-style things like that are left to the maintainer if they really strongly prefer it one particular way. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists