Blame TODO

Packit Service 646995
iSCSI DEVELOPMENT HOWTO AND TODO
Packit Service 646995
--------------------------------
Packit Service 646995
July 7th 2011
Packit Service 646995
Packit Service 646995
Packit Service 646995
If you are admin or user and just want to send a fix, just send the fix any
Packit Service 646995
way you can. We can port the patch to the proper tree and fix up the patch
Packit Service 646995
for you. Engineers that would like to do more advanced development then the
Packit Service 646995
following guideline should be followed.
Packit Service 646995
Packit Service 646995
Submitting Patches
Packit Service 646995
------------------
Packit Service 646995
Code should follow the Linux kernel codying style doc:
Packit Service 646995
http://www.kernel.org/doc/Documentation/CodingStyle
Packit Service 646995
Packit Service 646995
Patches should be submitted to the open-iscsi list open-iscsi@googlegroups.com.
Packit Service 646995
They should be made with "git diff" or "diff -up" or "diff -uprN", and
Packit Service 646995
kernel patches must have a "Signed-off-by" line. See section 12
Packit Service 646995
http://www.kernel.org/doc/Documentation/SubmittingPatches for more
Packit Service 646995
information on the the signed off line.
Packit Service 646995
Packit Service 646995
Getting the Code
Packit Service 646995
----------------
Packit Service 646995
Kernel patches should be made against the linux-2.6-iscsi tree. This can
Packit Service 646995
be downloaded from kernel.org with git with the following commands:
Packit Service 646995
Packit Service 646995
git clone git://git.kernel.org/pub/scm/linux/kernel/git/mnc/linux-2.6-iscsi.git
Packit Service 646995
Packit Service 646995
Userspace patches should be made against the open-iscsi git tree:
Packit Service 646995
Packit Service 646995
git clone git://git.kernel.org/pub/scm/linux/kernel/git/mnc/open-iscsi.git
Packit Service 646995
Packit Service 646995
Packit Service 646995
Packit Service 646995
KERNEL TODO ITEMS
Packit Service 646995
-----------------
Packit Service 646995
Packit Service 646995
1. Make iSCSI log messages humanly readable. In many cases the iscsi tools
Packit Service 646995
and modules will log a error number value. The most well known is conn
Packit Service 646995
error 1011. Users should not have to search on google for what this means.
Packit Service 646995
Packit Service 646995
We should:
Packit Service 646995
Packit Service 646995
1. Write a simple table to convert the error values to a string and print
Packit Service 646995
them out.
Packit Service 646995
Packit Service 646995
2. Document the values, how you commonly hit them and common solutions
Packit Service 646995
in the iSCSI docs.
Packit Service 646995
Packit Service 646995
See scsi_transport_iscsi.c:iscsi_conn_error_event for where the evil
Packit Service 646995
"detected conn error 1011" is printed. See the enum iscsi_err in iscsi_if.h
Packit Service 646995
for a definition of the error code values.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
2. Implement iSCSI dev loss support.
Packit Service 646995
Packit Service 646995
Currently if a session is down for longer than replacement/recovery_timeout
Packit Service 646995
seconds, the iscsi layer will unblock the devices and fail IO. Other
Packit Service 646995
transport, like FC and SAS, will do something similar. FC has a
Packit Service 646995
fast_io_fail tmo which will unblock devices and fail IO, then it has a
Packit Service 646995
dev_loss_tmo which will delete the devices accessed through that port.
Packit Service 646995
Packit Service 646995
iSCSI needs to implement dev_loss_tmo behavior, because apps are beginning
Packit Service 646995
to expect this behavior. An initial path was made here:
Packit Service 646995
http://groups.google.com/group/open-iscsi/msg/031510ab4cecccfd?dmode=source 
Packit Service 646995
Packit Service 646995
Since all drivers want this behavior we want to make it common. We need to
Packit Service 646995
change the patch in that link to add a dev_loss_tmo handler callback to the
Packit Service 646995
scsi_transport_template struct, and add some common sysfs and helpers
Packit Service 646995
functions to manage the dev_loss_tmo variable.
Packit Service 646995
Packit Service 646995
Packit Service 646995
*Being worked on by Vikek S
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
3. Reduce locking contention between session lock.
Packit Service 646995
Packit Service 646995
The session lock is basically one big lock that protects everything
Packit Service 646995
in the iscsi_session. This lock could be broken down into smaller locks
Packit Service 646995
and maybe even replaced with something that would not require a lock.
Packit Service 646995
Packit Service 646995
For example:
Packit Service 646995
Packit Service 646995
1. The session lock serializes access to the current R2T the initiator is
Packit Service 646995
handling (a R2T from the target or the initialR2T if being used). libiscsi/
Packit Service 646995
libiscsi_tcp will call iscsi_tcp_get_curr_r2t and grab the session lock in
Packit Service 646995
the xmit path from the xmit thread and then in the recv path
Packit Service 646995
libiscsi_tcp/iscsi_tcp will call iscsi_tcp_r2t_rsp (this function is called
Packit Service 646995
with the session lock held). We could add a new per iscsi_task lock and
Packit Service 646995
use that to guard the R2T.
Packit Service 646995
Packit Service 646995
2. For iscsi_tcp and cxgb*i, libiscsi uses the session->cmdqueue linked list
Packit Service 646995
and the session lock to queue IO from the queuecommand function (run from
Packit Service 646995
scsi softirq or kblockd context) to the iscsi xmit thread. Once the task is
Packit Service 646995
sent from that thread, it is deleted from the list.
Packit Service 646995
Packit Service 646995
It seems we should be able to remove the linked list use here. The tasks
Packit Service 646995
are all preallocated in the session->cmds array. We can access that
Packit Service 646995
array and check the task->state (see fail_scsi_tasks for an example).
Packit Service 646995
We just need to come up with a way to safely set the task state,
Packit Service 646995
wake the xmit thread and make sure that tasks are executed in the order
Packit Service 646995
that the scsi layer sent them to our queuecommand function.
Packit Service 646995
Packit Service 646995
A starting point on the queueing:
Packit Service 646995
We might be able to create a workqueue per processor, queue the work,
Packit Service 646995
which in this case is the execution of the task, from the queuecommand,
Packit Service 646995
then rely on the work queue synchronization and serialization code.
Packit Service 646995
Not 100% sure about this.
Packit Service 646995
Packit Service 646995
Alternative to changing the threading:
Packit Service 646995
Can we figure out a way to just remove the xmit thread? We currently
Packit Service 646995
cannot because the network may only be able to send 1000 bytes, but
Packit Service 646995
to send the current command we need to send 2000. We cannot sleep
Packit Service 646995
from the queuecommand context until another 1000 bytes frees up and for
Packit Service 646995
iscsi_tcp we cannot sleep from the recv conext (this happens because we
Packit Service 646995
could have got a R2T from target and are handling it from the recv path).
Packit Service 646995
Packit Service 646995
Packit Service 646995
Note: that for iser and offload drivers like bnx2i and be2iscsi their
Packit Service 646995
is no xmit thread used.
Packit Service 646995
Packit Service 646995
Note2: cxgb*i does not actually need the xmit thread so a side project
Packit Service 646995
could be to convert that driver.
Packit Service 646995
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
4. Make memory access more efficient on multi-processor machines.
Packit Service 646995
We are moving twords per process queues in the block layer, so it would
Packit Service 646995
be a good idea to move the iscsi structs to be allocated on a per process
Packit Service 646995
basis.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
5. Make blk_iopoll support (see block/blk-iopoll.c and be2iscsi for an
Packit Service 646995
example) being able to round robin IO across processors or complete
Packit Service 646995
on the processor it was queued on
Packit Service 646995
(today it always completes the IO on the processor the softirq was raised on),
Packit Service 646995
and convert bnx2i, ib_iser and cxgb*i to it.
Packit Service 646995
Packit Service 646995
Not sure if it will help iscsi_tcp and cxgb, because their completion is done
Packit Service 646995
from the network softirq which does polling already. With irq balancing it
Packit Service 646995
can also be spread over all processors too.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
7. When userspace calls into the kernel using the iscsi netlink interface
Packit Service 646995
to execute oprations like creating/destroying a session, create a connection
Packit Service 646995
to a target, etc the rx_queue_mutex is held the entire time (see
Packit Service 646995
iscsi_if_rx for the iscsi netlink interface entry point). This means
Packit Service 646995
if the driver should block every thing will be held up.
Packit Service 646995
Packit Service 646995
iscsi_tcp does not block, but some offload drivers might for a couple seconds
Packit Service 646995
to 10 or 15 secs while it figures out what is going on or cleans up. This a 
Packit Service 646995
major problem for things like multipath where one connection blocking up the
Packit Service 646995
recovery of every other connection will delay IO from re-flowing quickly.
Packit Service 646995
Packit Service 646995
We should looking into breaking up the rx_queue_mutex into finer grained
Packit Service 646995
locks or making it multi threaded. For the latter we could queue operations
Packit Service 646995
into workqueues.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
7. Add tracing support to iscsi modules. See the scsi layer's
Packit Service 646995
trace_scsi_dispatch_cmd_start for an example.
Packit Service 646995
Packit Service 646995
Well, actually in general look into all the tracing stuff available
Packit Service 646995
(trace_printk/ftrace, etc) and use one.
Packit Service 646995
Packit Service 646995
See http://lwn.net/Articles/291091/ for some details on what is out
Packit Service 646995
there. We can only use something that is upstream though.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
8. Improve the iscsi driver logging. Each driver has a different
Packit Service 646995
way to control logging. We should unify them and make it manageable
Packit Service 646995
by iscsiadm. So each driver would use a common format, there would
Packit Service 646995
be a common kernel interface to set the logging level, etc.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
9. Implement more features from the iSCSI RFC if they are worth it.
Packit Service 646995
Packit Service 646995
- Error Recovery Level (ERL) 1 support - will help tape support.
Packit Service 646995
- Multi R2T support - Might improve write performance.
Packit Service 646995
- OutOfOrder support - Might imrpove performance.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
10. Add support for digest/CRC offload.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
11. Finish intel IOAT support. I started this here:
Packit Service 646995
http://groups.google.com/group/open-iscsi/msg/2626b8606edbe690?dmode=source
Packit Service 646995
but could only test on boxes with 1 gig interfaces which showed no
Packit Service 646995
difference in performance. Intel had said they saw significant throughput
Packit Service 646995
gains when using 10 gig.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
12. Remove the login buffer preallocated buffer. Storage drivers must be able
Packit Service 646995
to make forward process, so that they can always write out a page incase the
Packit Service 646995
kernel needs to allocate the page to another process. If the connection were
Packit Service 646995
to be disconnected and the initiator needed to relogin to the target at this
Packit Service 646995
time, we might not be abe to allocate a page for the login commands buffer.
Packit Service 646995
Packit Service 646995
To work around the problem the initiator prealloctes a 8K (sometimes
Packit Service 646995
more depending on the page size) buffer for each session (see iscsi_conn_setup'
Packit Service 646995
s __get_free_pages call). This is obviously very wasteful since it will be
Packit Service 646995
a rare occurrence. Can we think of a way to allow multiple sessions to
Packit Service 646995
be relogged in at the same time, but not have to preallocate so many
Packit Service 646995
buffers?
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
13. Support iSCSI over swap safely.
Packit Service 646995
Packit Service 646995
Basically just need to hook iscsi_tcp into the patches that
Packit Service 646995
were submitted here for NBD.
Packit Service 646995
Packit Service 646995
https://lwn.net/Articles/446831/
Packit Service 646995
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
Packit Service 646995
Packit Service 646995
Packit Service 646995
Packit Service 646995
USERSPACE TODO ITEMS
Packit Service 646995
--------------------
Packit Service 646995
1. The iscsi tools, iscsid, iscsiadm and iscsid, have a debug flag, -d N, that
Packit Service 646995
allows the user to control the amount of output that is logged. The argument
Packit Service 646995
N is a integer from 1 to 8, with 8 printing out the most output.
Packit Service 646995
Packit Service 646995
The problem is that the values from 1 to 8 do not really mean much. It would
Packit Service 646995
helpful if we could replace them with something that controls what exactly
Packit Service 646995
the iscsi tools and kernel modules log.
Packit Service 646995
Packit Service 646995
For example, if we made the debug level argument a bitmap then
Packit Service 646995
Packit Service 646995
iscsiadm -m node --login -d LOGIN_ERRS,PDUS,FUNCTION
Packit Service 646995
Packit Service 646995
might print out extended iscsi login error information (LOGIN_ERRS),
Packit Service 646995
the iSCSI packets that were sent/receieved (PDUS), and the functions
Packit Service 646995
that were run (FUNCTION). Note, the use of a bitmapp and the debug
Packit Service 646995
levels are just an example. Feel free to do something else.
Packit Service 646995
Packit Service 646995
Packit Service 646995
We would want to be able to have iscsiadm control the iscsi kernel
Packit Service 646995
logging as well. There are interfaces like
Packit Service 646995
/sys/module/libiscsi/paramters/*debug*
Packit Service 646995
/sys/module/libiscsi_tcp/paramters/*debug*
Packit Service 646995
/sys/module/iscsi_tcp/paramters/*debug*
Packit Service 646995
/sys/module/scsi_transport_iscsi/paramters/*debug*
Packit Service 646995
Packit Service 646995
but we would want to extend the debugging options to be finer grained
Packit Service 646995
and we would want to make it supportable by all iscsi drivers.
Packit Service 646995
(see #8 on the kernel todo).
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
2. "iscsiadm -m session -P 3" can print out a lot of information about the
Packit Service 646995
session, but not all configuration values are printed.
Packit Service 646995
Packit Service 646995
iscsiadm should be modified to print out other settings like timeouts,
Packit Service 646995
Chap settings,  the iSCSI values that were requested vs negotiated for, etc.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
3. iscsiadm cannot update a setting of a running session. If you want
Packit Service 646995
to change a timeout you have to run the iscsiadm logout command,
Packit Service 646995
then update the record value, then login:
Packit Service 646995
Packit Service 646995
iscsiadm -m node -T target -p ip -u
Packit Service 646995
iscsidm -m node -T target -p ip -o update -n node.session.timeo.replacement_timeout -v 30
Packit Service 646995
iscsiadm -m node -T target -p ip -l
Packit Service 646995
Packit Service 646995
iscsiadm should be modified to allow updating of a setting without having
Packit Service 646995
to run the iscsiadm command.
Packit Service 646995
Packit Service 646995
Note that for some settings like iSCSI ones (ImmediateData, FirstBurstLength,
Packit Service 646995
etc)  that must be negotiated with the target we will have to logout the
Packit Service 646995
target then re-login, but we should not have to completely destroy the session
Packit Service 646995
and scsi devices like is done when running the iscsiadm logout command. We
Packit Service 646995
should be able to pass iscsid the new values and then have iscsid logout and
Packit Service 646995
relogin.
Packit Service 646995
Packit Service 646995
Other settings like the abort timeout will not need a logout/login. We can
Packit Service 646995
just pass those to the kernel or iscsid to use.
Packit Service 646995
Packit Service 646995
Packit Service 646995
*Being worked on by Tomoaki Nishimura
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
4. iscsiadm will attempt to perform logins/logouts in parallel. Running
Packit Service 646995
iscsiadm -m node -L, will cause iscsiadm to login to all portals with
Packit Service 646995
the startup=automatic field set at the same time.
Packit Service 646995
Packit Service 646995
To log into a target, iscsiadm opens a socket to iscsid, sends iscsid a
Packit Service 646995
request to login to a target, iscsid performs the iSCSI login operation,
Packit Service 646995
then iscsid sends iscsiadm a reply.
Packit Service 646995
Packit Service 646995
To perform multiple logins iscsiadm will open a socket for each login
Packit Service 646995
request, then wait for a reply. This is a problem because for 1000s of targets
Packit Service 646995
we will have 1000s of sockets open. There is a rlimit to control how many
Packit Service 646995
files a process can have open and iscsiadm currently runs setrlimit to
Packit Service 646995
increase this.
Packit Service 646995
Packit Service 646995
With users creating lots of virtual iscsi interfaces on the target and
Packit Service 646995
initiator with each having multiple paths it beomes inefficient to open
Packit Service 646995
a socket for each requests.
Packit Service 646995
Packit Service 646995
At the very least we want to handle setrlimit RLIMIT_NOFILE limit better,
Packit Service 646995
and it would be best to just stop openening a socket per login request.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
6. Implement broadcast/multicasts support, so the initiator can
Packit Service 646995
find iSNS servers without the user having to set the iSNS server address.
Packit Service 646995
Packit Service 646995
See
Packit Service 646995
5.6.5.14. Name Service Heartbeat (Heartbeat)
Packit Service 646995
in
Packit Service 646995
http://tools.ietf.org/html//rfc4171
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
7. Open-iscsi uses the open-isns iSNS library. The library might be a little
Packit Service 646995
too complicated and a little too heavy for what we need. Investigate
Packit Service 646995
replacing it.
Packit Service 646995
Packit Service 646995
Also explore merging the open-isns and linux-isns projects, so we do not have
Packit Service 646995
to support multiple isns clients/servers in linux.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
8. Implement the DHCP iSNS option support, so we the initiator can
Packit Service 646995
find the iSNS sever without the user having to set the iSNS server address.
Packit Service 646995
See:
Packit Service 646995
http://www.ietf.org/rfc/rfc4174.txt
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
9. Some iscsiadm/iscsid operations that access the iscsi DB and sysfs can be
Packit Service 646995
up to Big O(N^2). Some of the code was written when we thought 64 sessions
Packit Service 646995
would be a lot and the norm would be 4 or 8. Due to virtualization, cloud use,
Packit Service 646995
and targets like equallogic that do a target per logical unit (device) we can
Packit Service 646995
see 1000s of sessions.
Packit Service 646995
Packit Service 646995
- We should look into making the record DB more efficient. Maybe
Packit Service 646995
time to use a real DB (something small simple and efficient since this
Packit Service 646995
needs to run in places like the initramfs).
Packit Service 646995
Packit Service 646995
- Rewrite code to look up a running session so we do not have loop
Packit Service 646995
over every session in sysfs.
Packit Service 646995
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
10. Look into using udev's libudev for our sysfs access in iscsiadm/iscsid/
Packit Service 646995
iscsistart.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
11. iSCSI lib.
Packit Service 646995
Packit Service 646995
I am working on this one. Hopefully it should be done soon.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------
Packit Service 646995
Packit Service 646995
12. Figure out how to stop using our own copy of iscsi_if.h, since
Packit Service 646995
it gets out of sync with the kernel version, and that's not good.
Packit Service 646995
Packit Service 646995
---------------------------------------------------------------------------