I'm not sure why you do this on shm segments only. Why not directly on
vmas ? 
I'm considering to add a similar thing to the mm_struct to allow
processes to allocate on their "homenode" only. But I could also live
with it being per VMA.
In case of a large number of such bindings or nodes it may make sense
to cache and share the zone lists, but that can be probably left for a
future patch.
> +static inline struct page *__page_cache_alloc(struct address_space *x, int cold)
> +{
> +	int gfp_mask;
> +	struct zonelist *zonelist;
> +
> +	gfp_mask = x->gfp_mask;
> +	if (cold)
> +		gfp_mask |= __GFP_COLD;
> +	if (!x->binding)
> +		zonelist = get_zonelist(gfp_mask);
> +	else
> +		zonelist = &x->binding->zonelist;
> +
> +	return __alloc_pages(gfp_mask, 0, zonelist);
> +}
I would move a function of this size out of line. In fact i think it
should even in the non NUMA case be a simple function call with zonelist
getting evaluated out-of-line.
> +asmlinkage unsigned long sys_membind(unsigned long start, unsigned long len, 
> +		unsigned long *mask_ptr, unsigned int mask_len, unsigned long policy)
> +{
> +	DECLARE_BITMAP(cpu_mask, NR_CPUS);
> +	DECLARE_BITMAP(node_mask, MAX_NUMNODES);
> +	struct vm_area_struct *vma = NULL;
> +	struct address_space *mapping;
> +	int error = 0;
> +
> +	/* Deal with getting cpu_mask from userspace & translating to node_mask */
> +	if (mask_len > NR_CPUS) {
> +		error = -EINVAL;
> +		goto out;
> +	}
I don't like that check. It requires hardcoding NR_CPUs (which can be
variable!) in the application. It would be better to allow arbitary
length arguments, but only error when there is a bit set outside
NR_CPUS. Of course arbitary would be hard regarding the copy from user,
so perhaps use some very big limit (e.g. one page worth of bitmap) 
> +	CLEAR_BITMAP(cpu_mask, NR_CPUS);
> +	CLEAR_BITMAP(node_mask, MAX_NUMNODES);
> +	if (copy_from_user(cpu_mask, mask_ptr, (mask_len+7)/8)) {
> +		error = -EFAULT;
> +		goto out;
> +	}
> +	cpumask_to_nodemask(cpu_mask, node_mask);
> +
> +	vma = find_vma(current->mm, start);
> +	if (!(vma && vma->vm_file && vma->vm_ops && 
> +		vma->vm_ops->nopage == shmem_nopage)) {
> +		/* This isn't a shm segment.  For now, we bail. */
> +		printk("%s: Can only bind shm(em) segments for now!\n", __FUNCTION__);
Instead of __FUNCTION__ i would use current->comm here. 
Or better perhaps just remove the error printks.
-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/