Appendix B. Login Phase Examples
In the first example, the initiator and target authenticate each other via Kerberos:

   I-> Login (CSG,NSG=0,1 T=1)
       InitiatorName=iqn.1999-07.com.os:hostid.77
       TargetName=iqn.1999-07.com.example:diskarray.sn.88
       AuthMethod=KRB5,SRP,None

   T-> Login (CSG,NSG=0,0 T=0)
       AuthMethod=KRB5

   I-> Login (CSG,NSG=0,1 T=1)
       KRB_AP_REQ=<krb_ap_req>

(krb_ap_req contains the Kerberos V5 ticket and authenticator with MUTUAL-REQUIRED set in the ap-options field)

If the authentication is successful, the target proceeds with:

   T-> Login (CSG,NSG=0,1 T=1)
       KRB_AP_REP=<krb_ap_rep>

(krb_ap_rep is the Kerberos V5 mutual authentication reply)

If the authentication is successful, the initiator may proceed with:

   I-> Login (CSG,NSG=1,0 T=0)
       FirstBurstLength=8192

   T-> Login (CSG,NSG=1,0 T=0)
       FirstBurstLength=4096
       MaxBurstLength=8192

   I-> Login (CSG,NSG=1,0 T=0)
       MaxBurstLength=8192
       ... more iSCSI Operational Parameters

   T-> Login (CSG,NSG=1,0 T=0)
       ... more iSCSI Operational Parameters

And at the end:

   I-> Login (CSG,NSG=1,3 T=1)
       optional iSCSI parameters

   T-> Login (CSG,NSG=1,3 T=1)
       "login accept"
If the initiator's authentication by the target is not successful, the target responds with:

   T-> Login "login reject"

instead of the Login KRB_AP_REP message, and it terminates the connection.

If the target's authentication by the initiator is not successful, the initiator terminates the connection (without responding to the Login KRB_AP_REP message).

In the next example, only the initiator is authenticated by the target via Kerberos:

   I-> Login (CSG,NSG=0,1 T=1)
       InitiatorName=iqn.1999-07.com.os:hostid.77
       TargetName=iqn.1999-07.com.example:diskarray.sn.88
       AuthMethod=SRP,KRB5,None

   T-> Login-PR (CSG,NSG=0,0 T=0)
       AuthMethod=KRB5

   I-> Login (CSG,NSG=0,1 T=1)
       KRB_AP_REQ=<krb_ap_req>

(MUTUAL-REQUIRED not set in the ap-options field of krb_ap_req)

If the authentication is successful, the target proceeds with:

   T-> Login (CSG,NSG=0,1 T=1)

   I-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

   T-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

   . . .

   T-> Login (CSG,NSG=1,3 T=1)
       "login accept"
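In both Kerberos examples the initiator offers an ordered AuthMethod list and the target answers with a single value. The following is a minimal sketch of that selection, assuming the list-negotiation rule from the main body of the iSCSI specification (the responder picks the first offered value that it supports). The function name `negotiate_list` and the set of supported methods are hypothetical and chosen only for illustration.

```python
from typing import Optional, Set


def negotiate_list(offered: str, supported: Set[str]) -> Optional[str]:
    """Return the first offered value the responder supports, else None.

    `offered` is the comma-separated value of a list-negotiated key such
    as AuthMethod, in the originator's order of preference.
    """
    for value in offered.split(","):
        if value in supported:
            return value
    return None


# Hypothetical target that supports Kerberos or no authentication, but not SRP.
target_supported = {"KRB5", "None"}
chosen = negotiate_list("SRP,KRB5,None", target_supported)
```

With the offer `SRP,KRB5,None` from the second example, this hypothetical target would select `KRB5`, which matches the `AuthMethod=KRB5` response shown above.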
In the next example, the initiator and target authenticate each other via SRP:

   I-> Login (CSG,NSG=0,1 T=1)
       InitiatorName=iqn.1999-07.com.os:hostid.77
       TargetName=iqn.1999-07.com.example:diskarray.sn.88
       AuthMethod=KRB5,SRP,None

   T-> Login-PR (CSG,NSG=0,0 T=0)
       AuthMethod=SRP

   I-> Login (CSG,NSG=0,0 T=0)
       SRP_U=<user>
       TargetAuth=Yes

   T-> Login (CSG,NSG=0,0 T=0)
       SRP_N=<N>
       SRP_g=<g>
       SRP_s=<s>

   I-> Login (CSG,NSG=0,0 T=0)
       SRP_A=<A>

   T-> Login (CSG,NSG=0,0 T=0)
       SRP_B=<B>

   I-> Login (CSG,NSG=0,1 T=1)
       SRP_M=<M>

If the initiator authentication is successful, the target proceeds with:

   T-> Login (CSG,NSG=0,1 T=1)
       SRP_HM=<H(A | M | K)>

where N, g, s, A, B, M, and H(A | M | K) are defined in [RFC2945].

If the target authentication is not successful, the initiator terminates the connection; otherwise, it proceeds.

   I-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

   T-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters
And at the end:

   I-> Login (CSG,NSG=1,3 T=1)
       optional iSCSI parameters

   T-> Login (CSG,NSG=1,3 T=1)
       "login accept"

If the initiator authentication is not successful, the target responds with:

   T-> Login "login reject"

instead of the T-> Login SRP_HM=<H(A | M | K)> message, and it terminates the connection.

In the next example, only the initiator is authenticated by the target via SRP:

   I-> Login (CSG,NSG=0,1 T=1)
       InitiatorName=iqn.1999-07.com.os:hostid.77
       TargetName=iqn.1999-07.com.example:diskarray.sn.88
       AuthMethod=KRB5,SRP,None

   T-> Login-PR (CSG,NSG=0,0 T=0)
       AuthMethod=SRP

   I-> Login (CSG,NSG=0,0 T=0)
       SRP_U=<user>
       TargetAuth=No

   T-> Login (CSG,NSG=0,0 T=0)
       SRP_N=<N>
       SRP_g=<g>
       SRP_s=<s>

   I-> Login (CSG,NSG=0,0 T=0)
       SRP_A=<A>

   T-> Login (CSG,NSG=0,0 T=0)
       SRP_B=<B>

   I-> Login (CSG,NSG=0,1 T=1)
       SRP_M=<M>
If the initiator authentication is successful, the target proceeds with:

   T-> Login (CSG,NSG=0,1 T=1)

   I-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

   T-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

And at the end:

   I-> Login (CSG,NSG=1,3 T=1)
       optional iSCSI parameters

   T-> Login (CSG,NSG=1,3 T=1)
       "login accept"

In the next example, the initiator and target authenticate each other via CHAP:

   I-> Login (CSG,NSG=0,0 T=0)
       InitiatorName=iqn.1999-07.com.os:hostid.77
       TargetName=iqn.1999-07.com.example:diskarray.sn.88
       AuthMethod=KRB5,CHAP,None

   T-> Login-PR (CSG,NSG=0,0 T=0)
       AuthMethod=CHAP

   I-> Login (CSG,NSG=0,0 T=0)
       CHAP_A=<A1,A2>

   T-> Login (CSG,NSG=0,0 T=0)
       CHAP_A=<A1>
       CHAP_I=<I>
       CHAP_C=<C>

   I-> Login (CSG,NSG=0,1 T=1)
       CHAP_N=<N>
       CHAP_R=<R>
       CHAP_I=<I>
       CHAP_C=<C>
If the initiator authentication is successful, the target proceeds with:

   T-> Login (CSG,NSG=0,1 T=1)
       CHAP_N=<N>
       CHAP_R=<R>

If the target authentication is not successful, the initiator aborts the connection; otherwise, it proceeds.

   I-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

   T-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

And at the end:

   I-> Login (CSG,NSG=1,3 T=1)
       optional iSCSI parameters

   T-> Login (CSG,NSG=1,3 T=1)
       "login accept"

If the initiator authentication is not successful, the target responds with:

   T-> Login "login reject"

instead of the Login CHAP_R=<response> "proceed and change stage" message, and it terminates the connection.

In the next example, only the initiator is authenticated by the target via CHAP:

   I-> Login (CSG,NSG=0,1 T=0)
       InitiatorName=iqn.1999-07.com.os:hostid.77
       TargetName=iqn.1999-07.com.example:diskarray.sn.88
       AuthMethod=KRB5,CHAP,None

   T-> Login-PR (CSG,NSG=0,0 T=0)
       AuthMethod=CHAP

   I-> Login (CSG,NSG=0,0 T=0)
       CHAP_A=<A1,A2>
   T-> Login (CSG,NSG=0,0 T=0)
       CHAP_A=<A1>
       CHAP_I=<I>
       CHAP_C=<C>

   I-> Login (CSG,NSG=0,1 T=1)
       CHAP_N=<N>
       CHAP_R=<R>

If the initiator authentication is successful, the target proceeds with:

   T-> Login (CSG,NSG=0,1 T=1)

   I-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

   T-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

And at the end:

   I-> Login (CSG,NSG=1,3 T=1)
       optional iSCSI parameters

   T-> Login (CSG,NSG=1,3 T=1)
       "login accept"

In the next example, the initiator does not offer any security parameters. It therefore may offer iSCSI parameters on the Login PDU with the T bit set to 1, and the target may respond with a final Login Response PDU immediately:

   I-> Login (CSG,NSG=1,3 T=1)
       InitiatorName=iqn.1999-07.com.os:hostid.77
       TargetName=iqn.1999-07.com.example:diskarray.sn.88
       ... iSCSI parameters

   T-> Login (CSG,NSG=1,3 T=1)
       "login accept"
       ... iSCSI parameters

In the next example, the initiator does offer security parameters on the Login PDU, but the target does not choose any (i.e., chooses the "None" values):

   I-> Login (CSG,NSG=0,1 T=1)
       InitiatorName=iqn.1999-07.com.os:hostid.77
       TargetName=iqn.1999-07.com.example:diskarray.sn.88
       AuthMethod=KRB5,SRP,None
   T-> Login-PR (CSG,NSG=0,1 T=1)
       AuthMethod=None

   I-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

   T-> Login (CSG,NSG=1,0 T=0)
       ... iSCSI parameters

And at the end:

   I-> Login (CSG,NSG=1,3 T=1)
       optional iSCSI parameters

   T-> Login (CSG,NSG=1,3 T=1)
       "login accept"

Appendix C. SendTargets Operation
The text in this appendix is a normative part of this document.

To reduce the amount of configuration required on an initiator, iSCSI provides the SendTargets Text Request. The initiator uses the SendTargets request to get a list of targets to which it may have access, as well as the list of addresses (IP address and TCP port) on which these targets may be accessed.

To make use of SendTargets, an initiator must first establish one of two types of sessions. If the initiator establishes the session using the key "SessionType=Discovery", the session is a Discovery session, and a target name does not need to be specified. Otherwise, the session is a Normal operational session. The SendTargets command MUST only be sent during the Full Feature Phase of a Normal or Discovery session.

A system that contains targets MUST support Discovery sessions on each of its iSCSI IP address-port pairs and MUST support the SendTargets command on the Discovery session. In a Discovery session, a target MUST return all path information (IP address-port pairs and Target Portal Group Tags) for the targets on the target Network Entity that the requesting initiator is authorized to access.

A target MUST support the SendTargets command on operational sessions; these will only return path information about the target to which the session is connected and do not need to return information about other target names that may be defined in the responding system.

An initiator MAY make use of the SendTargets command as it sees fit.
A SendTargets command consists of a single Text Request PDU. This PDU contains exactly one text key and value. The text key MUST be SendTargets. The expected response depends upon the value, as well as whether the session is a Discovery session or an operational session.

The value must be one of:

   All

      The initiator is requesting that information on all relevant targets known to the implementation be returned. This value MUST be supported on a Discovery session and MUST NOT be supported on an operational session.

   <iSCSI-target-name>

      If an iSCSI Target Name is specified, the session should respond with addresses for only the named target, if possible. This value MUST be supported on Discovery sessions. A Discovery session MUST be capable of returning addresses for those targets that would have been returned had value=All been designated.

   <nothing>

      The session should only respond with addresses for the target to which the session is logged in. This MUST be supported on operational sessions and MUST NOT return targets other than the one to which the session is logged in.

The response to this command is a Text Response that contains a list of zero or more targets and, optionally, their addresses. Each target is returned as a target record. A target record begins with the TargetName text key, followed by a list of TargetAddress text keys, and bounded by the end of the Text Response or the next TargetName key, which begins a new record. No text keys other than TargetName and TargetAddress are permitted within a SendTargets response.

For the format of the TargetName, see Section 13.4.

A Discovery session MAY respond to a SendTargets request with its complete list of targets, or with a list of targets that is based on the name of the initiator logged in to the session.

A SendTargets response MUST NOT contain target names if there are no targets for the requesting initiator to access.
Each target record returned includes zero or more TargetAddress fields.

Each target record starts with one text key of the form:

   TargetName=<target-name-goes-here>

followed by zero or more address keys of the form:

   TargetAddress=<hostname-or-ipaddress>[:<tcp-port>],<portal-group-tag>

The hostname-or-ipaddress contains a domain name, IPv4 address, or IPv6 address ([RFC4291]), as specified for the TargetAddress key.

A hostname-or-ipaddress duplicated in TargetAddress responses for a given node (the port is absent or equal) would probably indicate that multiple address families are in use at once (IPv6 and IPv4).

Each TargetAddress belongs to a portal group, identified by its numeric Target Portal Group Tag (see Section 13.9). The iSCSI Target Name, together with this tag, constitutes the SCSI port identifier; the tag only needs to be unique within a given target name's list of addresses.

Multiple-connection sessions can span iSCSI addresses that belong to the same portal group.

Multiple-connection sessions cannot span iSCSI addresses that belong to different portal groups.

If a SendTargets response reports an iSCSI address for a target, it SHOULD also report all other addresses in its portal group in the same response.

A SendTargets Text Response can be longer than a single Text Response PDU and makes use of the long Text Responses as specified.

After obtaining a list of targets from the Discovery session, an iSCSI initiator may initiate new sessions to log in to the discovered targets for full operation. The initiator MAY keep the Discovery session open and MAY send subsequent SendTargets commands to discover new targets.
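The record structure described above (a TargetName key followed by zero or more TargetAddress keys) can be sketched as a small parser. This is an illustrative sketch only: it assumes the response has already been split into (key, value) pairs, and it assumes that an IPv6 literal is enclosed in square brackets in the TargetAddress value. The names `TargetAddress`, `TargetRecord`, `parse_address`, and `parse_send_targets` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class TargetAddress:
    host: str
    port: Optional[int]  # None when the optional TCP port was omitted
    tag: int             # Target Portal Group Tag


@dataclass
class TargetRecord:
    name: str
    addresses: List[TargetAddress] = field(default_factory=list)


def parse_address(value: str) -> TargetAddress:
    """Parse '<host>[:<port>],<tag>' into its three parts."""
    hostport, _, tag = value.rpartition(",")
    if hostport.startswith("["):  # assumed bracketed IPv6 literal
        host, _, rest = hostport[1:].partition("]")
        port = int(rest[1:]) if rest.startswith(":") else None
    else:
        host, sep, port_text = hostport.partition(":")
        port = int(port_text) if sep else None
    return TargetAddress(host, port, int(tag))


def parse_send_targets(pairs: Iterable[Tuple[str, str]]) -> List[TargetRecord]:
    """Group TargetAddress keys under the TargetName that precedes them."""
    records: List[TargetRecord] = []
    for key, value in pairs:
        if key == "TargetName":
            records.append(TargetRecord(value))  # starts a new record
        elif key == "TargetAddress":
            if not records:
                raise ValueError("TargetAddress before any TargetName")
            records[-1].addresses.append(parse_address(value))
        else:
            raise ValueError("unexpected key in SendTargets response: " + key)
    return records


records = parse_send_targets([
    ("TargetName", "iqn.1993-11.com.example:diskarray.sn.8675309"),
    ("TargetAddress", "10.1.0.45:3000,1"),
    ("TargetAddress", "[2001:db8::1]:3260,2"),
])
```

The parser rejects any key other than TargetName and TargetAddress, mirroring the rule that no other text keys are permitted in a SendTargets response.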
Examples:

This example is the SendTargets response from a single target that has no other interface ports.

The initiator sends a Text Request that contains:

   SendTargets=All

The target sends a Text Response that contains:

   TargetName=iqn.1993-11.com.example:diskarray.sn.8675309

All the target had to return in this simple case was the target name. It is assumed by the initiator that the IP address and TCP port for this target are the same as those used on the current connection to the default iSCSI target.

The next example has two internal iSCSI targets, each accessible via two different ports with different IP addresses. The following is the Text Response:

   TargetName=iqn.1993-11.com.example:diskarray.sn.8675309
   TargetAddress=10.1.0.45:3000,1
   TargetAddress=10.1.1.45:3000,2
   TargetName=iqn.1993-11.com.example:diskarray.sn.1234567
   TargetAddress=10.1.0.45:3000,1
   TargetAddress=10.1.1.45:3000,2

Both targets share both addresses; the multiple addresses are likely used to provide multi-path support. The initiator may connect to either target name on either address. Each of the addresses has its own Target Portal Group Tag; they do not support spanning multiple-connection sessions with each other. Keep in mind that the Target Portal Group Tags for the two named targets are independent of one another; portal group "1" on the first target is not necessarily the same as portal group "1" on the second target.

In the above example, a DNS host name or an IPv6 address could have been returned instead of an IPv4 address.
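The portal-group rule (a multiple-connection session may span only addresses that carry the same Target Portal Group Tag under the same target name) can be sketched as follows. The helper names `portal_groups` and `can_span` are hypothetical, and the input is assumed to be the (address, tag) pairs already extracted for one target name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def portal_groups(addresses: Iterable[Tuple[str, int]]) -> Dict[int, List[str]]:
    """Group the addresses of ONE target name by Target Portal Group Tag."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for address, tag in addresses:
        groups[tag].append(address)
    return dict(groups)


def can_span(addresses: Iterable[Tuple[str, int]], a: str, b: str) -> bool:
    """True if one multiple-connection session may use both addresses."""
    tags = dict(addresses)
    return tags[a] == tags[b]


# Addresses reported for the first target name in the example above.
first_target = [("10.1.0.45:3000", 1), ("10.1.1.45:3000", 2)]
groups = portal_groups(first_target)
spannable = can_span(first_target, "10.1.0.45:3000", "10.1.1.45:3000")
```

Because tags are scoped to a target name, the grouping is computed per record; comparing tags taken from two different target names would not be meaningful.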
The next Text Response shows a target that supports spanning sessions across multiple addresses and further illustrates the use of the Target Portal Group Tags:

   TargetName=iqn.1993-11.com.example:diskarray.sn.8675309
   TargetAddress=10.1.0.45:3000,1
   TargetAddress=10.1.1.46:3000,1
   TargetAddress=10.1.0.47:3000,2
   TargetAddress=10.1.1.48:3000,2
   TargetAddress=10.1.1.49:3000,3

In this example, any of the target addresses can be used to reach the same target. A single-connection session can be established to any of these TCP addresses. A multiple-connection session could span addresses .45 and .46 or .47 and .48 but cannot span any other combination. A TargetAddress with its own tag (.49) cannot be combined with any other address within the same session.

This SendTargets response does not indicate whether .49 supports multiple connections per session; it is communicated via the MaxConnections text key upon login to the target.

Appendix D. Algorithmic Presentation of Error Recovery Classes
This appendix illustrates the error recovery classes using a pseudo-programming language. The procedure names are chosen to be obvious to most implementers. Each of the recovery classes described has initiator procedures as well as target procedures. These algorithms focus on outlining the mechanics of error recovery classes and do not exhaustively describe all other aspects/cases. Examples of this approach are as follows:

   - Handling for only certain Opcode types is shown.

   - Only certain reason codes (e.g., Recovery in Logout command) are outlined.

   - Resultant cases, such as recovery of Synchronization on a header digest error, are considered out of scope in these algorithms. In this particular example, a header digest error may lead to connection recovery if some type of Sync and Steering layer is not implemented.
These algorithms strive to convey the iSCSI error recovery concepts in the simplest terms and are not designed to be optimal.

D.1. General Data Structure and Procedure Description
This section defines the procedures and data structures that are commonly used by all the error recovery algorithms. The structures may not be the exhaustive representations of what is required for a typical implementation.

Data structure definitions:

   struct TransferContext {
       int TargetTransferTag;
       int ExpectedDataSN;
   };

   struct TCB {              /* task control block */
       Boolean SoFarInOrder;
       int ExpectedDataSN;   /* used for both R2Ts and Data */
       int MissingDataSNList[MaxMissingDPDU];
       Boolean FbitReceived;
       Boolean StatusXferd;
       Boolean CurrentlyAllegiant;
       int ActiveR2Ts;
       int Response;
       char *Reason;
       struct TransferContext
                   TransferContextList[MaxOutstandingR2T];
       int InitiatorTaskTag;
       int CmdSN;
       int SNACK_Tag;
   };

   struct Connection {
       struct Session SessionReference;
       Boolean SoFarInOrder;
       int CID;
       int State;
       int CurrentTimeout;
       int ExpectedStatSN;
       int MissingStatSNList[MaxMissingSPDU];
       Boolean PerformConnectionCleanup;
   };
   struct Session {
       int NumConnections;
       int CmdSN;
       int MaxConnections;
       int ErrorRecoveryLevel;
       struct iSCSIEndpoint OtherEndInfo;
       struct Connection ConnectionList[MaxSupportedConns];
   };

Procedure descriptions:

   Receive-an-In-PDU(transport connection, inbound PDU);
   check-basic-validity(inbound PDU);
   Start-Timer(timeout handler, argument, timeout value);
   Build-And-Send-Reject(transport connection, bad PDU, reason code);

D.2. Within-command Error Recovery Algorithms
D.2.1. Procedure Descriptions
   Recover-Data-if-Possible(last required DataSN, task control block);
   Build-And-Send-DSnack(task control block);
   Build-And-Send-RDSnack(task control block);
   Build-And-Send-Abort(task control block);
   SCSI-Task-Completion(task control block);
   Build-And-Send-A-Data-Burst(transport connection, data-descriptor,
                               task control block);
   Build-And-Send-R2T(transport connection, data-descriptor,
                      task control block);
   Build-And-Send-Status(transport connection, task control block);
   Transfer-Context-Timeout-Handler(transfer context);

Notes:

   - One procedure used in this section: the Handle-Status-SNACK-request is defined in Appendix D.3.

   - The response-processing pseudocode shown in the target algorithms applies to all solicited PDUs that carry the StatSN -- SCSI Response, Text Response, etc.
D.2.2. Initiator Algorithms
   Recover-Data-if-Possible(LastRequiredDataSN, TCB)
   {
       if (operational ErrorRecoveryLevel > 0) {
           if (# of missing PDUs is trackable) {
               Note the missing DataSNs in TCB.
               if (the task spanned a change in
                   MaxRecvDataSegmentLength) {
                   if (TCB.StatusXferd is TRUE)
                       drop the status PDU;
                   Build-And-Send-RDSnack(TCB);
               } else {
                   Build-And-Send-DSnack(TCB);
               }
           } else {
               TCB.Reason = "Protocol Service CRC error";
           }
       } else {
           TCB.Reason = "Protocol Service CRC error";
       }
       if (TCB.Reason == "Protocol Service CRC error") {
           Clear the missing PDU list in the TCB.
           if (TCB.StatusXferd is not TRUE)
               Build-And-Send-Abort(TCB);
       }
   }

   Receive-an-In-PDU(Connection, CurrentPDU)
   {
       check-basic-validity(CurrentPDU);
       if (Header-Digest-Bad) discard, return;
       Retrieve TCB for CurrentPDU.InitiatorTaskTag.
       if ((CurrentPDU.type == Data)
           or (CurrentPDU.type == R2T)) {
           if (Data-Digest-Bad for Data) {
               send-data-SNACK = TRUE;
               LastRequiredDataSN = CurrentPDU.DataSN;
           } else {
               if (TCB.SoFarInOrder == TRUE) {
                   if (current DataSN is expected) {
                       Increment TCB.ExpectedDataSN.
                   } else {
                       TCB.SoFarInOrder = FALSE;
                       send-data-SNACK = TRUE;
                   }
               } else {
                   if (current DataSN was considered missing) {
                       remove current DataSN from missing PDU list.
                   } else if (current DataSN is higher than expected) {
                       send-data-SNACK = TRUE;
                   } else {
                       discard, return;
                   }
                   Adjust TCB.ExpectedDataSN if appropriate.
               }
               LastRequiredDataSN = CurrentPDU.DataSN - 1;
           }
           if (send-data-SNACK is TRUE and
               task is not already considered failed) {
               Recover-Data-if-Possible(LastRequiredDataSN, TCB);
           }
           if (missing data PDU list is empty) {
               TCB.SoFarInOrder = TRUE;
           }
           if (CurrentPDU.type == R2T) {
               Increment ActiveR2Ts for this task.
               Create a data-descriptor for the data burst.
               Build-And-Send-A-Data-Burst(Connection,
                                           data-descriptor, TCB);
           }
       } else if (CurrentPDU.type == Response) {
           if (Data-Digest-Bad) {
               send-status-SNACK = TRUE;
           } else {
               TCB.StatusXferd = TRUE;
               Store the status information in TCB.
               if (ExpDataSN does not match) {
                   TCB.SoFarInOrder = FALSE;
                   Recover-Data-if-Possible(current DataSN, TCB);
               }
               if (missing data PDU list is empty) {
                   TCB.SoFarInOrder = TRUE;
               }
           }
       } else { /* REST UNRELATED TO WITHIN-COMMAND-RECOVERY,
                 * NOT SHOWN */
       }
       if ((TCB.SoFarInOrder == TRUE) and
           (TCB.StatusXferd == TRUE)) {
           SCSI-Task-Completion(TCB);
       }
   }
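The branching in Recover-Data-if-Possible can be condensed into a single decision function. The following is a minimal sketch that only classifies the outcome and sends nothing; the enum `Action`, the function `recovery_action`, and its boolean parameters are hypothetical names standing in for the conditions written in prose in the pseudocode.

```python
from enum import Enum


class Action(Enum):
    RDATA_SNACK = "send R-Data SNACK"
    DATA_SNACK = "send Data SNACK"
    ABORT = "fail task and send abort"
    FAIL_ONLY = "fail task (status already transferred, no abort)"


def recovery_action(error_recovery_level: int,
                    missing_trackable: bool,
                    spanned_mrdsl_change: bool,
                    status_transferred: bool) -> Action:
    """Classify what Recover-Data-if-Possible would do for one task."""
    if error_recovery_level > 0 and missing_trackable:
        # Recoverable: choose the SNACK flavor. An R-Data SNACK is used
        # when the task spanned a MaxRecvDataSegmentLength change.
        return Action.RDATA_SNACK if spanned_mrdsl_change else Action.DATA_SNACK
    # Not recoverable: the task fails with "Protocol Service CRC error";
    # an abort is sent only if status has not been transferred yet.
    return Action.FAIL_ONLY if status_transferred else Action.ABORT
```

For example, at ErrorRecoveryLevel 0 with no status received, `recovery_action(0, True, False, False)` yields `Action.ABORT`, which corresponds to the Build-And-Send-Abort branch above.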
D.2.3. Target Algorithms
   Receive-an-In-PDU(Connection, CurrentPDU)
   {
       check-basic-validity(CurrentPDU);
       if (Header-Digest-Bad) discard, return;
       Retrieve TCB for CurrentPDU.InitiatorTaskTag.
       if (CurrentPDU.type == Data) {
           Retrieve TContext from CurrentPDU.TargetTransferTag;
           if (Data-Digest-Bad) {
               Build-And-Send-Reject(Connection, CurrentPDU,
                                     Payload-Digest-Error);
               Note the missing data PDUs in MissingDataRange[].
               send-recovery-R2T = TRUE;
           } else {
               if (current DataSN is not expected) {
                   Note the missing data PDUs in MissingDataRange[].
                   send-recovery-R2T = TRUE;
               }
               if (CurrentPDU.Fbit == TRUE) {
                   if (current PDU is solicited) {
                       Decrement TCB.ActiveR2Ts.
                   }
                   if ((current PDU is unsolicited and
                        data received is less than I/O length and
                        data received is less than FirstBurstLength)
                       or (current PDU is solicited and
                           the length of this burst is less than
                           expected)) {
                       send-recovery-R2T = TRUE;
                       Note the missing data in MissingDataRange[].
                   }
               }
           }
           Increment TContext.ExpectedDataSN.
           if (send-recovery-R2T is TRUE and
               task is not already considered failed) {
               if (operational ErrorRecoveryLevel > 0) {
                   Increment TCB.ActiveR2Ts.
                   Create a data-descriptor for the data burst
                       from MissingDataRange.
                   Build-And-Send-R2T(Connection, data-descriptor, TCB);
               } else {
                   if (current PDU is the last unsolicited)
                       TCB.Reason = "Not enough unsolicited data";
                   else
                       TCB.Reason = "Protocol Service CRC error";
               }
           }
           if (TCB.ActiveR2Ts == 0) {
               Build-And-Send-Status(Connection, TCB);
           }
       } else if (CurrentPDU.type == SNACK) {
           snack-failure = FALSE;
           if (operational ErrorRecoveryLevel > 0) {
               if (CurrentPDU.type == Data/R2T) {
                   if (the request is satisfiable) {
                       if (request for Data) {
                           Create a data-descriptor for the data burst
                               from BegRun and RunLength.
                           Build-And-Send-A-Data-Burst(Connection,
                               data-descriptor, TCB);
                       } else { /* R2T */
                           Create a data-descriptor for the data burst
                               from BegRun and RunLength.
                           Build-And-Send-R2T(Connection,
                               data-descriptor, TCB);
                       }
                   } else {
                       snack-failure = TRUE;
                   }
               } else if (CurrentPDU.type == status) {
                   Handle-Status-SNACK-request(Connection, CurrentPDU);
               } else if (CurrentPDU.type == DataACK) {
                   Consider all data up to CurrentPDU.BegRun as
                       acknowledged.
                   Free up the retransmission resources for that data.
               } else if (CurrentPDU.type == R-Data SNACK) {
                   Create a data descriptor for a data burst covering
                       all unacknowledged data.
                   Build-And-Send-A-Data-Burst(Connection,
                       data-descriptor, TCB);
                   TCB.SNACK_Tag = CurrentPDU.SNACK_Tag;
                   if (there's no more data to send) {
                       Build-And-Send-Status(Connection, TCB);
                   }
               }
           } else { /* operational ErrorRecoveryLevel = 0 */
               snack-failure = TRUE;
           }
           if (snack-failure == TRUE) {
               Build-And-Send-Reject(Connection, CurrentPDU,
                                     SNACK-Reject);
               if (TCB.StatusXferd != TRUE) {
                   TCB.Reason = "SNACK rejected";
                   Build-And-Send-Status(Connection, TCB);
               }
           }
       } else { /* REST UNRELATED TO WITHIN-COMMAND-RECOVERY,
                 * NOT SHOWN */
       }
   }

   Transfer-Context-Timeout-Handler(TContext)
   {
       Retrieve TCB and Connection from TContext.
       Decrement TCB.ActiveR2Ts.
       if (operational ErrorRecoveryLevel > 0 and
           task is not already considered failed) {
           Note the missing data PDUs in MissingDataRange[].
           Create a data-descriptor for the data burst
               from MissingDataRange[].
           Build-And-Send-R2T(Connection, data-descriptor, TCB);
       } else {
           TCB.Reason = "Protocol Service CRC error";
           if (TCB.ActiveR2Ts == 0) {
               Build-And-Send-Status(Connection, TCB);
           }
       }
   }

D.3. Within-connection Recovery Algorithms
D.3.1. Procedure Descriptions
   Recover-Status-if-Possible(transport connection,
                              currently received PDU);
   Evaluate-a-StatSN(transport connection, currently received PDU);
   Retransmit-Command-if-Possible(transport connection, CmdSN);
   Build-And-Send-SSnack(transport connection);
   Build-And-Send-Command(transport connection, task control block);
   Command-Acknowledge-Timeout-Handler(task control block);
   Status-Expect-Timeout-Handler(transport connection);
   Build-And-Send-NOP-Out(transport connection);
   Handle-Status-SNACK-request(transport connection,
                               Status SNACK PDU);
   Retransmit-Status-Burst(Status SNACK, task control block);
   Is-Acknowledged(beginning StatSN, run length);
Implementation-specific parameters that are tunable:

   InitiatorProactiveSNACKEnabled

Notes:

   - The initiator algorithms only deal with unsolicited NOP-In PDUs for generating Status SNACKs. A solicited NOP-In PDU has an assigned StatSN that, when out of order, could trigger the out-of-order StatSN handling in within-command algorithms, again leading to Recover-Status-if-Possible.

   - The pseudocode shown may result in the retransmission of unacknowledged commands in more cases than necessary. This will not, however, affect the correctness of the operation because the target is required to discard the duplicate CmdSNs.

   - The procedure Build-And-Send-Async is defined in the connection recovery algorithms.

   - The procedure Status-Expect-Timeout-Handler describes how initiators may proactively attempt to retrieve the Status if they so choose. This procedure is assumed to be triggered much before the standard ULP timeout.

D.3.2. Initiator Algorithms
   Recover-Status-if-Possible(Connection, CurrentPDU)
   {
       if ((Connection.State == LOGGED_IN) and
           connection is not already considered failed) {
           if (operational ErrorRecoveryLevel > 0) {
               if (# of missing PDUs is trackable) {
                   Note the missing StatSNs in Connection
                       that were not already requested with SNACK;
                   Build-And-Send-SSnack(Connection);
               } else {
                   Connection.PerformConnectionCleanup = TRUE;
               }
           } else {
               Connection.PerformConnectionCleanup = TRUE;
           }
           if (Connection.PerformConnectionCleanup == TRUE) {
               Start-Timer(Connection-Cleanup-Handler, Connection, 0);
           }
       }
   }
   Retransmit-Command-if-Possible(Connection, CmdSN)
   {
       if (operational ErrorRecoveryLevel > 0) {
           Retrieve the InitiatorTaskTag, and thus TCB for the CmdSN.
           Build-And-Send-Command(Connection, TCB);
       }
   }

   Evaluate-a-StatSN(Connection, CurrentPDU)
   {
       send-status-SNACK = FALSE;
       if (Connection.SoFarInOrder == TRUE) {
           if (current StatSN is the expected) {
               Increment Connection.ExpectedStatSN.
           } else {
               Connection.SoFarInOrder = FALSE;
               send-status-SNACK = TRUE;
           }
       } else {
           if (current StatSN was considered missing) {
               remove current StatSN from the missing list.
           } else {
               if (current StatSN is higher than expected) {
                   send-status-SNACK = TRUE;
               } else {
                   send-status-SNACK = FALSE;
                   discard the PDU;
               }
           }
           Adjust Connection.ExpectedStatSN if appropriate.
           if (missing StatSN list is empty) {
               Connection.SoFarInOrder = TRUE;
           }
       }
       return send-status-SNACK;
   }

   Receive-an-In-PDU(Connection, CurrentPDU)
   {
       check-basic-validity(CurrentPDU);
       if (Header-Digest-Bad) discard, return;
       Retrieve TCB for CurrentPDU.InitiatorTaskTag.
       if (CurrentPDU.type == NOP-In) {
           if (the PDU is unsolicited) {
               if (current StatSN is not expected) {
                   Recover-Status-if-Possible(Connection, CurrentPDU);
               }
               if (current ExpCmdSN is not Session.CmdSN) {
                   Retransmit-Command-if-Possible(Connection,
                       CurrentPDU.ExpCmdSN);
               }
           }
       } else if (CurrentPDU.type == Reject) {
           if (it is a data digest error on immediate data) {
               Retransmit-Command-if-Possible(Connection,
                   CurrentPDU.BadPDUHeader.CmdSN);
           }
       } else if (CurrentPDU.type == Response) {
           send-status-SNACK = Evaluate-a-StatSN(Connection,
                                                 CurrentPDU);
           if (send-status-SNACK == TRUE)
               Recover-Status-if-Possible(Connection, CurrentPDU);
       } else { /* REST UNRELATED TO WITHIN-CONNECTION-RECOVERY,
                 * NOT SHOWN */
       }
   }

   Command-Acknowledge-Timeout-Handler(TCB)
   {
       Retrieve the Connection for TCB.
       Retransmit-Command-if-Possible(Connection, TCB.CmdSN);
   }

   Status-Expect-Timeout-Handler(Connection)
   {
       if (operational ErrorRecoveryLevel > 0) {
           Build-And-Send-NOP-Out(Connection);
       } else if (InitiatorProactiveSNACKEnabled) {
           if ((Connection.State == LOGGED_IN) and
               connection is not already considered failed) {
               Build-And-Send-SSnack(Connection);
           }
       }
   }
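The StatSN bookkeeping in Evaluate-a-StatSN can be sketched in runnable form. This is one possible reading, not the algorithm as specified: it assumes that StatSN is compared with 32-bit serial number arithmetic, it interprets "Adjust Connection.ExpectedStatSN if appropriate" as "one past the highest StatSN seen", and it folds the "note the missing StatSNs" step from Recover-Status-if-Possible into the same method. The class `StatSNTracker`, the helper `sn_gt`, and the `max_missing` limit are hypothetical.

```python
MOD = 1 << 32   # StatSN is assumed to be a 32-bit wrapping counter
HALF = 1 << 31


def sn_gt(a: int, b: int) -> bool:
    """True if a is later than b under 32-bit serial number arithmetic."""
    return a != b and (a - b) % MOD < HALF


class StatSNTracker:
    def __init__(self, expected: int, max_missing: int = 64):
        self.expected = expected        # next StatSN beyond the highest seen
        self.missing = set()            # StatSNs skipped so far
        self.max_missing = max_missing  # hypothetical tracking limit
        self.needs_cleanup = False      # set when a gap is not trackable

    @property
    def so_far_in_order(self) -> bool:
        return not self.missing

    def evaluate(self, stat_sn: int) -> bool:
        """Return True when status recovery should be considered."""
        if stat_sn in self.missing:     # a previously missing status arrived
            self.missing.discard(stat_sn)
            return False
        if stat_sn == self.expected:    # in-order arrival
            self.expected = (stat_sn + 1) % MOD
            return False
        if sn_gt(stat_sn, self.expected):   # gap detected
            gap = (stat_sn - self.expected) % MOD
            if len(self.missing) + gap > self.max_missing:
                self.needs_cleanup = True   # too many to track
            else:
                for i in range(gap):
                    self.missing.add((self.expected + i) % MOD)
            self.expected = (stat_sn + 1) % MOD
            return True
        return False                    # stale duplicate; caller discards it


tracker = StatSNTracker(expected=5)
in_order = tracker.evaluate(5)   # expected value arrives
gap_seen = tracker.evaluate(8)   # 6 and 7 were skipped
```

The modular comparison keeps the tracker working when StatSN wraps past 0xFFFFFFFF, a case that a plain integer comparison would get wrong.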
D.3.3. Target Algorithms
   Handle-Status-SNACK-request(Connection, CurrentPDU)
   {
       if (operational ErrorRecoveryLevel > 0) {
           if (request for an acknowledged run) {
               Build-And-Send-Reject(Connection, CurrentPDU,
                                     Protocol-Error);
           } else if (request for an untransmitted run) {
               discard, return;
           } else {
               Retransmit-Status-Burst(CurrentPDU, TCB);
           }
       } else {
           Build-And-Send-Async(Connection, DroppedConnection,
                                DefaultTime2Wait, DefaultTime2Retain);
       }
   }

D.4. Connection Recovery Algorithms
D.4.1. Procedure Descriptions
   Build-And-Send-Async(transport connection, reason code,
                        minimum time, maximum time);
   Pick-A-Logged-In-Connection(session);
   Build-And-Send-Logout(transport connection,
                         logout connection identifier, reason code);
   PerformImplicitLogout(transport connection,
                         logout connection identifier,
                         target information);
   PerformLogin(transport connection, target information);
   CreateNewTransportConnection(target information);
   Build-And-Send-Command(transport connection, task control block);
   Connection-Cleanup-Handler(transport connection);
   Connection-Resource-Timeout-Handler(transport connection);
   Quiesce-And-Prepare-for-New-Allegiance(session,
                                          task control block);
   Build-And-Send-Logout-Response(transport connection,
                                  CID of connection in recovery,
                                  reason code);
   Build-And-Send-TaskMgmt-Response(transport connection,
                                    task mgmt command PDU,
                                    response code);
   Establish-New-Allegiance(task control block, transport connection);
   Schedule-Command-To-Continue(task control block);
Note:

   - Transport exception conditions such as unexpected connection termination, connection reset, and hung connection while the connection is in the Full Feature Phase are all assumed to be asynchronously signaled to the iSCSI layer using the Transport_Exception_Handler procedure.

D.4.2. Initiator Algorithms
   Receive-an-In-PDU(Connection, CurrentPDU)
   {
       check-basic-validity(CurrentPDU);
       if (Header-Digest-Bad) discard, return;
       Retrieve TCB from CurrentPDU.InitiatorTaskTag.
       if (CurrentPDU.type == Async) {
           if (CurrentPDU.AsyncEvent == ConnectionDropped) {
               Retrieve the AffectedConnection for
                   CurrentPDU.Parameter1.
               AffectedConnection.CurrentTimeout =
                   CurrentPDU.Parameter3;
               AffectedConnection.State = CLEANUP_WAIT;
               Start-Timer(Connection-Cleanup-Handler,
                           AffectedConnection, CurrentPDU.Parameter2);
           } else if (CurrentPDU.AsyncEvent == LogoutRequest) {
               AffectedConnection = Connection;
               AffectedConnection.State = LOGOUT_REQUESTED;
               AffectedConnection.PerformConnectionCleanup = TRUE;
               AffectedConnection.CurrentTimeout =
                   CurrentPDU.Parameter3;
               Start-Timer(Connection-Cleanup-Handler,
                           AffectedConnection, 0);
           } else if (CurrentPDU.AsyncEvent == SessionDropped) {
               for (each Connection) {
                   Connection.State = CLEANUP_WAIT;
                   Connection.CurrentTimeout = CurrentPDU.Parameter3;
                   Start-Timer(Connection-Cleanup-Handler,
                               Connection, CurrentPDU.Parameter2);
               }
               Session.state = FAILED;
           }
       } else if (CurrentPDU.type == LogoutResponse) {
           Retrieve the CleanupConnection for CurrentPDU.CID.
           if (CurrentPDU.Response == failure) {
               CleanupConnection.State = CLEANUP_WAIT;
           } else {
               CleanupConnection.State = FREE;
           }
       } else if (CurrentPDU.type == LoginResponse) {
           if (this is a response to an implicit Logout) {
               Retrieve the CleanupConnection.
               if (successful) {
                   CleanupConnection.State = FREE;
                   Connection.State = LOGGED_IN;
               } else {
                   CleanupConnection.State = CLEANUP_WAIT;
                   DestroyTransportConnection(Connection);
               }
           }
       } else { /* REST UNRELATED TO CONNECTION-RECOVERY,
                 * NOT SHOWN */
       }
       if (CleanupConnection.State == FREE) {
           for (each command that was active on CleanupConnection) {
               /* Establish new connection allegiance */
               NewConnection = Pick-A-Logged-In-Connection(Session);
               Build-And-Send-Command(NewConnection, TCB);
           }
       }
   }

   Connection-Cleanup-Handler(Connection)
   {
       Retrieve Session from Connection.
       if (Connection can still exchange iSCSI PDUs) {
           NewConnection = Connection;
       } else {
           Start-Timer(Connection-Resource-Timeout-Handler,
                       Connection, Connection.CurrentTimeout);
           if (there are other logged-in connections) {
               NewConnection = Pick-A-Logged-In-Connection(Session);
           } else {
               NewConnection =
                   CreateNewTransportConnection(Session.OtherEndInfo);
               Initiate an implicit Logout on NewConnection for
                   Connection.CID.
               return;
           }
       }
       Build-And-Send-Logout(NewConnection, Connection.CID,
                             RecoveryRemove);
   }
Transport_Exception_Handler(Connection)
{
  Connection.PerformConnectionCleanup = TRUE;
  if (the event is an unexpected transport disconnect) {
    Connection.State = CLEANUP_WAIT;
    Connection.CurrentTimeout = DefaultTime2Retain;
    Start-Timer(Connection-Cleanup-Handler, Connection,
                DefaultTime2Wait);
  } else {
    Connection.State = FREE;
  }
}

D.4.3. Target Algorithms
Receive-an-In-PDU(Connection, CurrentPDU)
{
  check-basic-validity(CurrentPDU);
  if (Header-Digest-Bad) discard, return;
  else if (Data-Digest-Bad) {
    Build-And-Send-Reject(Connection, CurrentPDU,
                          Payload-Digest-Error);
    discard, return;
  }
  Retrieve TCB and Session.

  if (CurrentPDU.type == Logout) {
    if (CurrentPDU.ReasonCode == RecoveryRemove) {
      Retrieve the CleanupConnection from CurrentPDU.CID.
      for (each command active on CleanupConnection) {
        Quiesce-And-Prepare-for-New-Allegiance(Session, TCB);
        TCB.CurrentlyAllegiant = FALSE;
      }
      Cleanup-Connection-State(CleanupConnection);
      if ((quiescing successful) and (cleanup successful)) {
        Build-And-Send-Logout-Response(Connection,
                            CleanupConnection.CID, Success);
      } else {
        Build-And-Send-Logout-Response(Connection,
                            CleanupConnection.CID, Failure);
      }
    }
  } else if ((CurrentPDU.type == Login) and
             operational ErrorRecoveryLevel == 2) {
    Retrieve the CleanupConnection from CurrentPDU.CID.
    for (each command active on CleanupConnection) {
      Quiesce-And-Prepare-for-New-Allegiance(Session, TCB);
      TCB.CurrentlyAllegiant = FALSE;
    }
    Cleanup-Connection-State(CleanupConnection);
    if ((quiescing successful) and (cleanup successful)) {
      Continue with the rest of the login processing;
    } else {
      Build-And-Send-Login-Response(Connection,
                          CleanupConnection.CID, Target Error);
    }
  } else if (CurrentPDU.type == TaskManagement) {
    if (CurrentPDU.function == "TaskReassign") {
      if (Session.ErrorRecoveryLevel < 2) {
        Build-And-Send-TaskMgmt-Response(Connection, CurrentPDU,
            "Task allegiance reassignment not supported");
      } else if (task is not found) {
        Build-And-Send-TaskMgmt-Response(Connection, CurrentPDU,
            "Task not in task set");
      } else if (task is currently allegiant) {
        Build-And-Send-TaskMgmt-Response(Connection, CurrentPDU,
            "Task still allegiant");
      } else {
        Establish-New-Allegiance(TCB, Connection);
        TCB.CurrentlyAllegiant = TRUE;
        Schedule-Command-To-Continue(TCB);
      }
    }
  } else { /* REST UNRELATED TO CONNECTION-RECOVERY,
            * NOT SHOWN */
  }
}
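The order of checks applied to a "TaskReassign" request in the target pseudocode above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a normative implementation: Task, handle_task_reassign, the tasks dictionary, and the "reassigned" return marker are hypothetical names introduced for illustration, and the response strings are taken from the pseudocode rather than from the Task Management Function Response encoding.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Task:
    tag: int
    currently_allegiant: bool
    cid: Optional[int] = None  # connection the task is allegiant to


def handle_task_reassign(tasks: Dict[int, Task],
                         error_recovery_level: int,
                         tag: int,
                         new_cid: int) -> str:
    """Apply the checks in the order used by the pseudocode."""
    if error_recovery_level < 2:
        return "Task allegiance reassignment not supported"
    task = tasks.get(tag)
    if task is None:
        return "Task not in task set"
    if task.currently_allegiant:
        return "Task still allegiant"
    # Establish the new allegiance; the command then continues on
    # the connection that carried the TaskReassign request.
    task.cid = new_cid
    task.currently_allegiant = True
    return "reassigned"  # hypothetical marker, not a protocol value
```

A second TaskReassign for the same task is answered with "Task still allegiant", because the first request has already made the task allegiant to the new connection.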
Transport_Exception_Handler(Connection)
{
  Connection.PerformConnectionCleanup = TRUE;
  if (the event is an unexpected transport disconnect) {
    Connection.State = CLEANUP_WAIT;
    Start-Timer(Connection-Resource-Timeout-Handler, Connection,
                (DefaultTime2Wait+DefaultTime2Retain));
    if (this Session has Full Feature Phase connections left) {
      DifferentConnection = Pick-A-Logged-In-Connection(Session);
      Build-And-Send-Async(DifferentConnection,
                           DroppedConnection, DefaultTime2Wait,
                           DefaultTime2Retain);
    }
  } else {
    Connection.State = FREE;
  }
}

Appendix E. Clearing Effects of Various Events on Targets
E.1. Clearing Effects on iSCSI Objects
The following tables describe the target behavior on receiving the events specified in the rows of the table. The second table is an extension of the first table and defines clearing actions for more objects on the same events. The legend is:

Y = Yes (cleared/discarded/reset on the event specified in the row). Unless otherwise noted, the clearing action is only applicable for the issuing initiator port.

N = No (not affected on the event specified in the row, i.e., stays at previous value).

NA = Not Applicable or Not Defined.
                       +------+------+------+------+------+
                       |IT (1)|IC (2)|CT (5)|ST (6)|PP (7)|
+----------------------+------+------+------+------+------+
|connection failure (8)|Y     |Y     |N     |N     |Y     |
+----------------------+------+------+------+------+------+
|connection state      |NA    |NA    |Y     |N     |NA    |
|timeout (9)           |      |      |      |      |      |
+----------------------+------+------+------+------+------+
|session timeout/      |Y     |Y     |Y     |Y     |Y (14)|
|closure/reinstatement |      |      |      |      |      |
|(10)                  |      |      |      |      |      |
+----------------------+------+------+------+------+------+
|session continuation  |NA    |NA    |N (11)|N     |NA    |
|(12)                  |      |      |      |      |      |
+----------------------+------+------+------+------+------+
|successful connection |Y     |Y     |Y     |N     |Y (13)|
|close logout          |      |      |      |      |      |
+----------------------+------+------+------+------+------+
|session failure (18)  |Y     |Y     |N     |N     |Y     |
+----------------------+------+------+------+------+------+
|successful recovery   |Y     |Y     |N     |N     |Y (13)|
|Logout                |      |      |      |      |      |
+----------------------+------+------+------+------+------+
|failed Logout         |Y     |Y     |N     |N     |Y     |
+----------------------+------+------+------+------+------+
|connection Login      |NA    |NA    |NA    |Y (15)|NA    |
|(leading)             |      |      |      |      |      |
+----------------------+------+------+------+------+------+
|connection Login      |NA    |NA    |N (11)|N     |Y     |
|(non-leading)         |      |      |      |      |      |
+----------------------+------+------+------+------+------+
|TARGET COLD RESET (16)|Y (20)|Y     |Y     |Y     |Y     |
+----------------------+------+------+------+------+------+
|TARGET WARM RESET (16)|Y (20)|Y     |Y     |Y     |Y     |
+----------------------+------+------+------+------+------+
|LU reset (19)         |Y (20)|Y     |Y     |Y     |Y     |
+----------------------+------+------+------+------+------+
|power cycle (16)      |Y     |Y     |Y     |Y     |Y     |
+----------------------+------+------+------+------+------+

(1) Incomplete TTTs (IT) are Target Transfer Tags on which the target is still expecting PDUs to be received. Examples include TTTs received via R2T, NOP-In, etc.
(2) Immediate Commands (IC) are immediate commands that have been received but are still waiting for execution on a target (for example, ABORT TASK SET).
(5) Connection Tasks (CT) are tasks that are active on the iSCSI connection in question.
(6) Session Tasks (ST) are tasks that are active on the entire iSCSI session. A union of "connection tasks" on all participating connections.
(7) Partial PDUs (PP) (if any) are PDUs that are partially sent and waiting for transport window credit to complete the transmission.
(8) Connection failure is a connection exception condition - one of the transport connections shut down, transport connections reset, or transport connections timed out, which abruptly terminated the iSCSI Full Feature Phase connection. A connection failure always takes the connection state machine to the CLEANUP_WAIT state.
(9) Connection state timeout happens if a connection spends more time than agreed upon during login negotiation in the CLEANUP_WAIT state, and this takes the connection to the FREE state (M1 transition in connection cleanup state diagram; see Section 8.2).
(10) Session timeout, closure, and reinstatement are defined in Section 6.3.5.
(11) This clearing effect is "Y" only if it is a connection reinstatement and the operational ErrorRecoveryLevel is less than 2.
(12) Session continuation is defined in Section 6.3.6.
(13) This clearing effect is only valid if the connection is being logged out on a different connection and when the connection being logged out on the target may have some partial PDUs pending to be sent. In all other cases, the effect is "NA".
(14) This clearing effect is only valid for a "close the session" logout in a multi-connection session. In all other cases, the effect is "NA".
(15) Only applicable if this leading connection login is a session reinstatement. If this is not the case, it is "NA".
(16) This operation affects all logged-in initiators.
(18) Session failure is defined in Section 6.3.6.
(19) This operation affects all logged-in initiators, and the clearing effects are only applicable to the LU being reset.
(20) With standard multi-task abort semantics (Section 4.2.3.3), a TARGET WARM RESET, a TARGET COLD RESET, or a LU reset would clear the active TTTs upon completion. However, the FastAbort multi-task abort semantics defined by Section 4.2.3.4 do not guarantee that the active TTTs are cleared by the end of the reset operations. In fact, the FastAbort semantics are designed to allow clearing the TTTs in a "lazy" fashion after the TMF Response is delivered. Thus, when TaskReporting=FastAbort (Section 13.23) is operational on a session, the clearing effect of reset operations on "Incomplete TTTs" is "N".
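The first table can be read as a lookup from an (event, object) pair to a clearing effect. The following Python sketch transcribes the base values of that table and applies note (20) for the FastAbort case. It is an illustration only: the names CLEARING_EFFECTS, RESET_EVENTS, and clearing_effect are hypothetical, and the conditional notes (11), (13), (14), and (15) are deliberately not modeled, so the returned value is the unqualified table entry for those cells.

```python
# Column order matches the first table: IT, IC, CT, ST, PP.
OBJECTS = ("IT", "IC", "CT", "ST", "PP")

# Base table entries, transcribed row by row (notes not applied).
CLEARING_EFFECTS = {
    "connection failure":                    ("Y", "Y", "N", "N", "Y"),
    "connection state timeout":              ("NA", "NA", "Y", "N", "NA"),
    "session timeout/closure/reinstatement": ("Y", "Y", "Y", "Y", "Y"),
    "session continuation":                  ("NA", "NA", "N", "N", "NA"),
    "successful connection close logout":    ("Y", "Y", "Y", "N", "Y"),
    "session failure":                       ("Y", "Y", "N", "N", "Y"),
    "successful recovery Logout":            ("Y", "Y", "N", "N", "Y"),
    "failed Logout":                         ("Y", "Y", "N", "N", "Y"),
    "connection Login (leading)":            ("NA", "NA", "NA", "Y", "NA"),
    "connection Login (non-leading)":        ("NA", "NA", "N", "N", "Y"),
    "TARGET COLD RESET":                     ("Y", "Y", "Y", "Y", "Y"),
    "TARGET WARM RESET":                     ("Y", "Y", "Y", "Y", "Y"),
    "LU reset":                              ("Y", "Y", "Y", "Y", "Y"),
    "power cycle":                           ("Y", "Y", "Y", "Y", "Y"),
}

# Rows whose IT entry carries note (20).
RESET_EVENTS = {"TARGET COLD RESET", "TARGET WARM RESET", "LU reset"}


def clearing_effect(event: str, obj: str, fast_abort: bool = False) -> str:
    """Return "Y", "N", or "NA" for the given event and object.

    fast_abort models TaskReporting=FastAbort being operational on the
    session; per note (20), resets then do not clear Incomplete TTTs.
    """
    if fast_abort and obj == "IT" and event in RESET_EVENTS:
        return "N"
    return CLEARING_EFFECTS[event][OBJECTS.index(obj)]
```

For example, clearing_effect("LU reset", "IT") yields "Y" under standard multi-task abort semantics, while the same query with fast_abort=True yields "N".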
                      +------+-------+------+------+-------+
                      |DC (1)|DD (2) |SS (3)|CS (4)|DS (5) |
+---------------------+------+-------+------+------+-------+
|connection failure   |N     |Y      |N     |N     |N      |
+---------------------+------+-------+------+------+-------+
|connection state     |Y     |NA     |Y     |N     |NA     |
|timeout              |      |       |      |      |       |
+---------------------+------+-------+------+------+-------+
|session timeout/     |Y     |Y      |Y (7) |Y     |NA     |
|closure/reinstatement|      |       |      |      |       |
+---------------------+------+-------+------+------+-------+
|session continuation |N (11)|NA (12)|NA    |N     |NA (13)|
+---------------------+------+-------+------+------+-------+
|successful connection|Y     |Y      |Y     |N     |NA     |
|close Logout         |      |       |      |      |       |
+---------------------+------+-------+------+------+-------+
|session failure      |N     |Y      |N     |N     |N      |
+---------------------+------+-------+------+------+-------+
|successful recovery  |Y     |Y      |Y     |N     |N      |
|Logout               |      |       |      |      |       |
+---------------------+------+-------+------+------+-------+
|failed Logout        |N     |Y (9)  |N     |N     |N      |
+---------------------+------+-------+------+------+-------+
|connection Login     |NA    |NA     |N (8) |N (8) |NA     |
|(leading)            |      |       |      |      |       |
+---------------------+------+-------+------+------+-------+
|connection Login     |N (11)|NA (12)|N (8) |N     |NA (13)|
|(non-leading)        |      |       |      |      |       |
+---------------------+------+-------+------+------+-------+
|TARGET COLD RESET    |Y     |Y      |Y     |Y (10)|NA     |
+---------------------+------+-------+------+------+-------+
|TARGET WARM RESET    |Y     |Y      |N     |N     |NA     |
+---------------------+------+-------+------+------+-------+
|LU reset             |N     |Y      |N     |N     |N      |
+---------------------+------+-------+------+------+-------+
|power cycle          |Y     |Y      |Y     |Y (10)|NA     |
+---------------------+------+-------+------+------+-------+

(1) Discontiguous Commands (DC) are commands allegiant to the connection in question and waiting to be reordered in the iSCSI layer. All "Y"s in this column assume that the task causing the event (if indeed the event is the result of a task) is issued as an immediate command, because the discontiguities can be ahead of the task.
(2) Discontiguous Data (DD) are data PDUs received for the task in question and waiting to be reordered due to prior discontiguities in the DataSN.
(3) "SS" refers to the StatSN.
(4) "CS" refers to the CmdSN.
(5) "DS" refers to the DataSN.
(7) This action clears the StatSN on all the connections.
(8) This sequence number is instantiated on this event.
(9) A logout failure drives the connection state machine to the CLEANUP_WAIT state, similar to the connection failure event. Hence, it has a similar effect on this and several other protocol aspects.
(10) This is cleared by virtue of the fact that all sessions with all initiators are terminated.
(11) This clearing effect is "Y" if it is a connection reinstatement.
(12) This clearing effect is "Y" only if it is a connection reinstatement and the operational ErrorRecoveryLevel is 2.
(13) This clearing effect is "N" only if it is a connection reinstatement and the operational ErrorRecoveryLevel is 2.

E.2. Clearing Effects on SCSI Objects
The only iSCSI protocol action that can effect clearing actions on SCSI objects is the "I_T nexus loss" notification (Section 6.3.5.1 ("Loss of Nexus Notification")). [SPC3] describes the clearing effects of this notification on a variety of SCSI attributes. In addition, SCSI standards documents (such as [SAM2] and [SBC2]) define additional clearing actions that may take place for several SCSI objects on SCSI events such as LU resets and power-on resets.

Since iSCSI defines a TARGET COLD RESET as a "protocol-equivalent" to a target power-cycle, the iSCSI TARGET COLD RESET must also be considered as the power-on reset event in interpreting the actions defined in the SCSI standards.

When the iSCSI session is reconstructed (between the same SCSI ports with the same nexus identifier), reestablishing the same I_T nexus, all SCSI objects that are defined to not clear on the "I_T nexus loss" notification event, such as persistent reservations, are automatically associated with this new session.
Acknowledgments
Several individuals on the original IPS Working Group made significant contributions to the original RFCs 3720, 3980, 4850, and 5048. Specifically, the authors of the original RFCs -- which herein are consolidated into a single document -- were the following:

RFC 3720: Julian Satran, Kalman Meth, Costa Sapuntzakis, Mallikarjun Chadalapaka, Efri Zeidner
RFC 3980: Marjorie Krueger, Mallikarjun Chadalapaka, Rob Elliott
RFC 4850: David Wysochanski
RFC 5048: Mallikarjun Chadalapaka

Many thanks to Fred Knight for contributing to the UML notations and drawings in this document.

We would in addition like to acknowledge the following individuals who contributed to this revised document: David Harrington, Paul Koning, Mark Edwards, Rob Elliott, and Martin Stiemerling.

Thanks to Yi Zeng and Nico Williams for suggesting and/or reviewing Kerberos-related security considerations text.

The authors gratefully acknowledge the valuable feedback during the Last Call review process from a number of individuals; their feedback significantly improved this document. The individuals were Stephen Farrell, Brian Haberman, Barry Leiba, Pete Resnick, Sean Turner, Alexey Melnikov, Kathleen Moriarty, Fred Knight, Mike Christie, Qiang Wang, Shiv Rajpal, and Andy Banta.

Finally, this document also benefited from significant review contributions from the Storm Working Group at large.

Comments may be sent to Mallikarjun Chadalapaka.
Authors' Addresses
Mallikarjun Chadalapaka
Microsoft
One Microsoft Way
Redmond, WA 98052
USA
EMail: cbm@chadalapaka.com

Julian Satran
Infinidat Ltd.
EMail: julians@infinidat.com, julian@satran.net

Kalman Meth
IBM Haifa Research Lab
Haifa University Campus - Mount Carmel
Haifa 31905, Israel
Phone +972.4.829.6341
EMail: meth@il.ibm.com

David L. Black
EMC Corporation
176 South St.
Hopkinton, MA 01748
USA
Phone +1 (508) 293-7953
EMail: david.black@emc.com