# Universal Types with BER/DER Decoder and DER Encoder

The *asn1crypto* library is a combination of universal type classes that
implement BER/DER decoding and DER encoding, a PEM encoder and decoder, and a
number of pre-built cryptographic type classes. This document covers the
universal type classes.

For a general overview of ASN.1 as used in cryptography, please see
[A Layman's Guide to a Subset of ASN.1, BER, and DER](http://luca.ntop.org/Teaching/Appunti/asn1.html).

This page contains the following sections:

 - [Universal Types](#universal-types)
 - [Basic Usage](#basic-usage)
 - [Sequence](#sequence)
 - [Set](#set)
 - [SequenceOf](#sequenceof)
 - [SetOf](#setof)
 - [Integer](#integer)
 - [Enumerated](#enumerated)
 - [ObjectIdentifier](#objectidentifier)
 - [BitString](#bitstring)
 - [Strings](#strings)
 - [UTCTime](#utctime)
 - [GeneralizedTime](#generalizedtime)
 - [Choice](#choice)
 - [Any](#any)
 - [Specification via OID](#specification-via-oid)
 - [Explicit and Implicit Tagging](#explicit-and-implicit-tagging)

## Universal Types

For general purpose ASN.1 parsing, the `asn1crypto.core` module is used. It
contains the following classes that parse, represent and serialize all of the
ASN.1 universal types:

| Class              | Native Type                            | Implementation Notes                 |
| ------------------ | -------------------------------------- | ------------------------------------ |
| `Boolean`          | `bool`                                 |                                      |
| `Integer`          | `int`                                  | may be `long` on Python 2            |
| `BitString`        | `tuple` of `int` or `set` of `unicode` | `set` used if `_map` present         |
| `OctetString`      | `bytes` (`str`)                        |                                      |
| `Null`             | `None`                                 |                                      |
| `ObjectIdentifier` | `str` (`unicode`)                      | string is dotted integer format      |
| `ObjectDescriptor` |                                        | no native conversion                 |
| `InstanceOf`       |                                        | no native conversion                 |
| `Real`             |                                        | no native conversion                 |
| `Enumerated`       | `str` (`unicode`)                      | `_map` must be set                   |
| `UTF8String`       | `str` (`unicode`)                      |                                      |
| `RelativeOid`      | `str` (`unicode`)                      | string is dotted integer format      |
| `Sequence`         | `OrderedDict`                          |                                      |
| `SequenceOf`       | `list`                                 |                                      |
| `Set`              | `OrderedDict`                          |                                      |
| `SetOf`            | `list`                                 |                                      |
| `EmbeddedPdv`      | `OrderedDict`                          | no named field parsing               |
| `NumericString`    | `str` (`unicode`)                      | no charset limitations               |
| `PrintableString`  | `str` (`unicode`)                      | no charset limitations               |
| `TeletexString`    | `str` (`unicode`)                      |                                      |
| `VideotexString`   | `bytes` (`str`)                        | no unicode conversion                |
| `IA5String`        | `str` (`unicode`)                      |                                      |
| `UTCTime`          | `datetime.datetime`                    |                                      |
| `GeneralizedTime`  | `datetime.datetime`                    | treated as UTC when no timezone      |
| `GraphicString`    | `str` (`unicode`)                      | unicode conversion as latin1         |
| `VisibleString`    | `str` (`unicode`)                      | no charset limitations               |
| `GeneralString`    | `str` (`unicode`)                      | unicode conversion as latin1         |
| `UniversalString`  | `str` (`unicode`)                      |                                      |
| `CharacterString`  | `str` (`unicode`)                      | unicode conversion as latin1         |
| `BMPString`        | `str` (`unicode`)                      |                                      |

For *Native Type*, the Python 3 type is listed first, with the Python 2 type
in parentheses.

As mentioned next to some of the types, value parsing may not be implemented
for types not currently used in cryptography (such as `ObjectDescriptor`,
`InstanceOf` and `Real`). Additionally, some of the string classes don't
enforce character set limitations, and for string types that have no single
defined encoding, the default encoding is set to latin1.

In addition, there are a few overridden types where various specifications use
a `BitString` or `OctetString` type to represent a different type. These
include:

| Class                | Native Type         | Implementation Notes            |
| -------------------- | ------------------- | ------------------------------- |
| `OctetBitString`     | `bytes` (`str`)     |                                 |
| `IntegerBitString`   | `int`               | may be `long` on Python 2       |
| `IntegerOctetString` | `int`               | may be `long` on Python 2       |

For situations where the DER encoded bytes of one type are embedded in another,
the `ParsableOctetString` and `ParsableOctetBitString` classes exist. These
function the same as `OctetString` and `OctetBitString`, however they also
have an attribute `.parsed` and a method `.parse()` that allow for
parsing the content as ASN.1 structures.

All of these overrides can be used with the `cast()` method to convert between
them. The only requirement is that the class being cast to has the same tag
as the original class. No re-encoding is done, rather the contents are simply
re-interpreted.

```python
from asn1crypto.core import BitString, OctetBitString, IntegerBitString

# A tuple of bits is needed here; a set literal would collapse to {0, 1}
bit = BitString((
    0, 0, 0, 0, 0, 0, 0, 1,
    0, 0, 0, 0, 0, 0, 1, 0,
))

# Will print (0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0)
print(bit.native)

octet = bit.cast(OctetBitString)

# Will print b'\x01\x02'
print(octet.native)

i = bit.cast(IntegerBitString)

# Will print 258
print(i.native)
```
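To make the re-interpretation concrete, here is a minimal pure-Python sketch, independent of *asn1crypto*, showing how the same sixteen bits can be read as a tuple of bits, a byte string, or an integer. The helper names `bits_to_bytes` and `bits_to_int` are hypothetical and are not part of the library.

```python
def bits_to_bytes(bits):
    """Pack a tuple of 0/1 ints (length a multiple of 8) into bytes."""
    if len(bits) % 8 != 0:
        raise ValueError('bit count must be a multiple of 8 for this sketch')
    out = bytearray()
    for offset in range(0, len(bits), 8):
        byte = 0
        # Most significant bit comes first within each byte
        for bit in bits[offset:offset + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def bits_to_int(bits):
    """Interpret a tuple of 0/1 ints as a big-endian unsigned integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


bits = (0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0)
as_bytes = bits_to_bytes(bits)
as_int = bits_to_int(bits)
print(as_bytes, as_int)
```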

## Basic Usage

All of the universal types implement four methods: the class method `.load()`
and the instance methods `.dump()`, `.copy()` and `.debug()`.

`.load()` accepts a byte string of DER or BER encoded data and returns an
object of the class it was called on. `.dump()` returns the serialization of
an object into DER encoding.

```python
from asn1crypto.core import Sequence

parsed = Sequence.load(der_byte_string)
serialized = parsed.dump()
```

By default, *asn1crypto* tries to be efficient and caches serialized data for
better performance. If the input data is possibly BER encoded, but the output
must be DER encoded, the `force` parameter may be used with `.dump()`.

```python
from asn1crypto.core import Sequence

parsed = Sequence.load(der_byte_string)
der_serialized = parsed.dump(force=True)
```

The `.copy()` method creates a deep copy of an object, allowing child fields to
be modified without affecting the original.

```python
from asn1crypto.core import Sequence

seq1 = Sequence.load(der_byte_string)
seq2 = seq1.copy()
# Assumes the first field holds an integer value
seq2[0] = seq1[0].native + 1
if seq1[0] != seq2[0]:
    print('Copies have distinct contents')
```

The `.debug()` method is available to help in situations where interaction with
another ASN.1 serializer or parsing is not functioning as expected. Calling
this method will print a tree structure with information about the header bytes,
class, method, tag, special tagging, content bytes, native Python value, child
fields and any sub-parsed values.

```python
from asn1crypto.core import Sequence

parsed = Sequence.load(der_byte_string)
parsed.debug()
```

In addition to the available methods, every instance has a `.native` property
that converts the data into a native Python data type.

```python
from pprint import pprint
from asn1crypto.core import Sequence

parsed = Sequence.load(der_byte_string)
pprint(parsed.native)
```
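For readers who want to see what the byte strings passed to `.load()` and returned by `.dump()` look like, the following is a rough pure-Python sketch of the DER tag-length-value layout for a single INTEGER. It is not how *asn1crypto* is implemented. The function names are hypothetical, and only short-form lengths (content up to 127 bytes) are handled.

```python
def encode_der_integer(value):
    """Encode a Python int as a DER INTEGER (tag 0x02)."""
    # Smallest two's complement width that still leaves room for the sign bit
    magnitude = value if value >= 0 else value + 1
    length = magnitude.bit_length() // 8 + 1
    content = value.to_bytes(length, 'big', signed=True)
    if len(content) > 127:
        raise ValueError('long-form lengths are not handled in this sketch')
    return bytes([0x02, len(content)]) + content


def decode_der_integer(data):
    """Decode a DER INTEGER produced by encode_der_integer()."""
    if len(data) < 3 or data[0] != 0x02 or data[1] > 127:
        raise ValueError('not a short-form DER INTEGER')
    if len(data) != 2 + data[1]:
        raise ValueError('length byte does not match the content size')
    return int.from_bytes(data[2:], 'big', signed=True)


encoded = encode_der_integer(258)
print(encoded.hex())
print(decode_der_integer(encoded))
```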

## Sequence

One of the core structures when dealing with ASN.1 is the Sequence type. The
`Sequence` class can handle fields with universal data types, however in most
situations the `_fields` property will need to be set with the expected
definition of each field in the Sequence.

### Configuration

The `_fields` property must be set to a `list` of 2-3 element `tuple`s. The
first element in the tuple must be a unicode string of the field name. The
second must be a type class - either a universal type, or a custom type. The
third, and optional, element is a `dict` with parameters to pass to the type
class for things like default values, marking the field as optional, or
implicit/explicit tagging.

```python
from asn1crypto.core import Sequence, Integer, OctetString, IA5String

class MySequence(Sequence):
    _fields = [
        ('field_one', Integer),
        ('field_two', OctetString),
        ('field_three', IA5String, {'optional': True}),
    ]
```

Implicit and explicit tagging will be covered in more detail later, however
the following are options that can be set for each field type class:

 - `{'default': 1}` sets the field's default value to `1`, allowing it to be
   omitted from the serialized form
 - `{'optional': True}` sets the field to be optional, allowing it to be
   omitted

### Usage

To access values of the sequence, use dict-like access via `[]` and use the
name of the field:

```python
seq = MySequence.load(der_byte_string)
print(seq['field_two'].native)
```

The values of fields can be set by assigning via `[]`. If the value assigned is
of the correct type class, it will be used as-is. If the value is not of the
correct type class, a new instance of that type class will be created and the
value will be passed to the constructor.

```python
seq = MySequence.load(der_byte_string)
# These statements will result in the same state
seq['field_one'] = Integer(5)
seq['field_one'] = 5
```

When fields are complex types such as `Sequence` or `SequenceOf`, there is no
way to construct the value out of a native Python data type.

### Optional Fields

When a field configured with the `optional` parameter is accessed but is not
present in the `Sequence`, the `VOID` object will be returned. This is an object
that is serialized to an empty byte string and returns `None` when `.native` is
accessed.
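As a byte-level illustration of an optional field being omitted, the sketch below hand-encodes the three `MySequence` fields in plain Python. It is not how *asn1crypto* is implemented. The helpers `tlv` and `encode_my_sequence` are hypothetical, and the sketch only covers integers from 0 to 127 and short-form lengths.

```python
def tlv(tag, content):
    """Wrap content bytes in a one-byte tag and a short-form DER length."""
    if len(content) > 127:
        raise ValueError('long-form lengths are not handled in this sketch')
    return bytes([tag, len(content)]) + content


def encode_my_sequence(field_one, field_two, field_three=None):
    """Encode the three fields; field_three is left out when it is None."""
    if not 0 <= field_one <= 127:
        raise ValueError('this sketch only encodes integers from 0 to 127')
    children = tlv(0x02, bytes([field_one]))      # INTEGER
    children += tlv(0x04, field_two)              # OCTET STRING
    if field_three is not None:
        children += tlv(0x16, field_three.encode('ascii'))  # IA5String
    return tlv(0x30, children)                    # SEQUENCE


without_optional = encode_my_sequence(5, b'ab')
with_optional = encode_my_sequence(5, b'ab', 'hi')
print(without_optional.hex())
print(with_optional.hex())
```

When the optional field is absent, it simply contributes no bytes to the encoding, which matches the description of `VOID` serializing to an empty byte string.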

## Set

The `Set` class is configured in the same way as `Sequence`, however it allows
serialized fields to be in any order, per the ASN.1 standard.

```python
from asn1crypto.core import Set, Integer, OctetString, IA5String

class MySet(Set):
    _fields = [
        ('field_one', Integer),
        ('field_two', OctetString),
        ('field_three', IA5String, {'optional': True}),
    ]
```

## SequenceOf

The `SequenceOf` class is used to allow for zero or more instances of a type.
The class uses the `_child_spec` property to define the instance class type.

```python
from asn1crypto.core import SequenceOf, Integer

class Integers(SequenceOf):
    _child_spec = Integer
```

Values in the `SequenceOf` can be accessed via `[]` with an integer key. The
length of the `SequenceOf` is determined via `len()`.

```python
values = Integers.load(der_byte_string)
for i in range(0, len(values)):
    print(values[i].native)
```

## SetOf

The `SetOf` class is an exact duplicate of `SequenceOf`. According to the ASN.1
standard, the difference is that a `SequenceOf` is explicitly ordered, however
`SetOf` may be in any order. This is comparable to the difference between a
Python `list` and `set`.

```python
from asn1crypto.core import SetOf, Integer

class Integers(SetOf):
    _child_spec = Integer
```

## Integer

The `Integer` class allows values to be *named*. An `Integer` with named values
may contain any integer, however values that have names will be represented
as those names when `.native` is accessed.

Named values are configured via the `_map` property, which must be a `dict`
with the keys being integers and the values being unicode strings.

```python
from asn1crypto.core import Integer

class Version(Integer):
    _map = {
        1: 'v1',
        2: 'v2',
    }

# Will print: "v1"
print(Version(1).native)

# Will print: 4
print(Version(4).native)
```

## Enumerated

The `Enumerated` class is almost identical to `Integer`, however only values in
the `_map` property are valid.

```python
from asn1crypto.core import Enumerated

class Version(Enumerated):
    _map = {
        1: 'v1',
        2: 'v2',
    }

# Will print: "v1"
print(Version(1).native)

# Will raise a ValueError exception
print(Version(4).native)
```

## ObjectIdentifier

The `ObjectIdentifier` class represents values of the ASN.1 type of the same
name. `ObjectIdentifier` instances are converted to a unicode string in a
dotted-integer format when `.native` is accessed.

While this standard conversion is a reasonable baseline, in most situations
it will be more maintainable to map the OID strings to a unicode string
containing a description of what the OID represents.

The mapping of OID strings to name strings is configured via the `_map`
property, which is a `dict` object with keys being unicode OID strings and the
values being unicode strings.

The `.dotted` attribute will always return a unicode string of the dotted
integer form of the OID.

The class methods `.map()` and `.unmap()` will convert a dotted integer unicode
string to the user-friendly name, and vice-versa.

```python
from asn1crypto.core import ObjectIdentifier

class MyType(ObjectIdentifier):
    _map = {
        '1.8.2.1.23': 'value_name',
        '1.8.2.1.24': 'other_value',
    }

# Will print: "value_name"
print(MyType('1.8.2.1.23').native)

# Will print: "1.8.2.1.23"
print(MyType('1.8.2.1.23').dotted)

# Will print: "1.8.2.1.25"
print(MyType('1.8.2.1.25').native)

# Will print "value_name"
print(MyType.map('1.8.2.1.23'))

# Will print "1.8.2.1.23"
print(MyType.unmap('value_name'))
```
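The dotted-integer string is a readable form of the encoded content bytes. As a hedged illustration of that relationship, the pure-Python sketch below converts a dotted string into the content bytes used by the BER/DER OBJECT IDENTIFIER encoding: the first two arcs are combined into one value, and every value is written in base 128 with a continuation bit. The function name `encode_oid_content` is hypothetical and not part of *asn1crypto*.

```python
def encode_oid_content(dotted):
    """Return the content bytes (no tag or length) for a dotted OID string."""
    arcs = [int(part) for part in dotted.split('.')]
    if len(arcs) < 2:
        raise ValueError('an OID needs at least two arcs')
    # The first two arcs share a single encoded value: 40 * first + second
    values = [arcs[0] * 40 + arcs[1]] + arcs[2:]
    out = bytearray()
    for value in values:
        # Collect 7-bit groups, least significant first
        groups = [value & 0x7F]
        value >>= 7
        while value:
            groups.append((value & 0x7F) | 0x80)  # continuation bit set
            value >>= 7
        out.extend(reversed(groups))
    return bytes(out)


print(encode_oid_content('1.8.2.1.23').hex())
print(encode_oid_content('1.2.840.113549').hex())
```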

## BitString

When no `_map` is set for a `BitString` class, the native representation is a
`tuple` of `int`s (being either `1` or `0`).

```python
from asn1crypto.core import BitString

b1 = BitString((1, 0, 1))
```

Additionally, it is possible to set the `_map` property to a dict where the
keys are bit indexes and the values are unicode string names. This allows
checking the value of a given bit by item access, and the native representation
becomes a `set` of unicode strings.

```python
from asn1crypto.core import BitString

class MyFlags(BitString):
    _map = {
        0: 'edit',
        1: 'delete',
        2: 'manage_users',
    }

permissions = MyFlags({'edit', 'delete'})

# This will be printed
if permissions['edit'] and permissions['delete']:
    print('Can edit and delete')

# This will not
if 'manage_users' in permissions.native:
    print('Is admin')
```
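The relationship between the two native forms can be sketched in plain Python. This is an illustration of the idea only, not the library's implementation. `FLAG_NAMES`, `bits_to_names` and `names_to_bits` are hypothetical names, with the dictionary mirroring the `_map` from the example above.

```python
FLAG_NAMES = {0: 'edit', 1: 'delete', 2: 'manage_users'}


def bits_to_names(bits, names):
    """Return the set of names whose bit index is set to 1."""
    return {names[index] for index, bit in enumerate(bits)
            if bit and index in names}


def names_to_bits(selected, names):
    """Return a tuple of 0/1 ints covering every index in the mapping."""
    unknown = set(selected) - set(names.values())
    if unknown:
        raise ValueError('unknown flag names: %s' % sorted(unknown))
    size = max(names) + 1
    return tuple(1 if names.get(index) in selected else 0
                 for index in range(size))


bits = names_to_bits({'edit', 'delete'}, FLAG_NAMES)
print(bits)
print(sorted(bits_to_names(bits, FLAG_NAMES)))
```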

## Strings

ASN.1 contains quite a number of string types:

| Type              | Standard Encoding                 | Implementation Encoding | Notes                                                                     |
| ----------------- | --------------------------------- | ----------------------- | ------------------------------------------------------------------------- |
| `UTF8String`      | UTF-8                             | UTF-8                   |                                                                           |
| `NumericString`   | ASCII `[0-9 ]`                    | ISO 8859-1              | The implementation is a superset of supported characters                  |
| `PrintableString` | ASCII `[a-zA-Z0-9 '()+,\-./:=?]`  | ISO 8859-1              | The implementation is a superset of supported characters                  |
| `TeletexString`   | ITU T.61                          | Custom                  | The implementation is based off of https://en.wikipedia.org/wiki/ITU_T.61 |
| `VideotexString`  | *?*                               | *None*                  | This has no set encoding, and is not used in cryptography                 |
| `IA5String`       | ITU T.50 (very similar to ASCII)  | ISO 8859-1              | The implementation is a superset of supported characters                  |
| `GraphicString`   | *                                 | ISO 8859-1              | This has no set encoding, but seems to often contain ISO 8859-1           |
| `VisibleString`   | ASCII (printable)                 | ISO 8859-1              | The implementation is a superset of supported characters                  |
| `GeneralString`   | *                                 | ISO 8859-1              | This has no set encoding, but seems to often contain ISO 8859-1           |
| `UniversalString` | UTF-32                            | UTF-32                  |                                                                           |
| `CharacterString` | *                                 | ISO 8859-1              | This has no set encoding, but seems to often contain ISO 8859-1           |
| `BMPString`       | UTF-16                            | UTF-16                  |                                                                           |

As noted in the table above, many of the implementations are supersets of the
supported characters. This simplifies parsing, but puts the onus of using valid
characters on the developer. However, in general `UTF8String`, `BMPString` or
`UniversalString` should be preferred when a choice is given.

All string types other than `VideotexString` are created from unicode strings.

```python
from asn1crypto.core import IA5String

print(IA5String('Testing!').native)
```
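To show how the encodings in the table differ at the byte level, the short sketch below encodes the same text with Python's built-in codecs. It assumes the big-endian, no-byte-order-mark forms that ASN.1 uses for `BMPString` and `UniversalString`. The variable names are illustrative only.

```python
text = 'Testing!'

utf8_content = text.encode('utf-8')           # UTF8String: 1 byte per ASCII character
bmp_content = text.encode('utf-16-be')        # BMPString: 2 bytes per character
universal_content = text.encode('utf-32-be')  # UniversalString: 4 bytes per character
latin1_content = text.encode('latin-1')       # ISO 8859-1, as listed for IA5String

for label, content in [
    ('UTF-8', utf8_content),
    ('UTF-16 (BE)', bmp_content),
    ('UTF-32 (BE)', universal_content),
    ('ISO 8859-1', latin1_content),
]:
    print(label, len(content), content.hex())
```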

## UTCTime

The class `UTCTime` accepts a unicode string in one of the formats:

 - `%y%m%d%H%MZ`
 - `%y%m%d%H%M%SZ`
 - `%y%m%d%H%M%z`
 - `%y%m%d%H%M%S%z`

or a `datetime.datetime` instance. See the
[Python datetime strptime() reference](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior)
for details of the formats.

When `.native` is accessed, it returns a `datetime.datetime` object with a
`tzinfo` of `asn1crypto.util.timezone.utc`.
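As a rough standard-library illustration of the two `Z`-suffixed formats, the sketch below parses such strings with `datetime.strptime()`. The function `parse_utc_time` is hypothetical and is not how the library parses values. The format is selected by string length because `strptime()` accepts one-digit fields, so trying the formats in turn could mis-split the digits. Python's `%y` maps 00-68 to 2000-2068, which may differ from the pivot year a given specification uses.

```python
from datetime import datetime, timezone

# Map the total string length to the matching format from the list above
FORMATS_BY_LENGTH = {
    11: '%y%m%d%H%MZ',
    13: '%y%m%d%H%M%SZ',
}


def parse_utc_time(value):
    """Parse a Z-suffixed UTCTime string into an aware datetime in UTC."""
    fmt = FORMATS_BY_LENGTH.get(len(value))
    if fmt is None:
        raise ValueError('unsupported UTCTime string in this sketch: %r' % value)
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


print(parse_utc_time('2501311230Z'))
print(parse_utc_time('250131123045Z'))
```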

## GeneralizedTime

The class `GeneralizedTime` accepts a unicode string in one of the formats:

 - `%Y%m%d%H`
 - `%Y%m%d%H%M`
 - `%Y%m%d%H%M%S`
 - `%Y%m%d%H%M%S.%f`
 - `%Y%m%d%HZ`
 - `%Y%m%d%H%MZ`
 - `%Y%m%d%H%M%SZ`
 - `%Y%m%d%H%M%S.%fZ`
 - `%Y%m%d%H%z`
 - `%Y%m%d%H%M%z`
 - `%Y%m%d%H%M%S%z`
 - `%Y%m%d%H%M%S.%f%z`

or a `datetime.datetime` instance. See the
[Python datetime strptime() reference](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior)
for details of the formats.

When `.native` is accessed, it returns a `datetime.datetime` object with a
`tzinfo` of `asn1crypto.util.timezone.utc`. For formats where a timezone
offset is specified (`[+-]\d{4}`), the time is converted to UTC. For
times without a timezone, the time is assumed to be in UTC.
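The timezone handling described above can be sketched with the standard library. The example covers only the three seconds-precision formats (`%Y%m%d%H%M%S`, with `Z`, or with an offset). The function `parse_generalized_time` is a hypothetical helper, not the library's parser.

```python
from datetime import datetime, timezone


def parse_generalized_time(value):
    """Parse a seconds-precision GeneralizedTime string into UTC."""
    if value.endswith('Z'):
        parsed = datetime.strptime(value, '%Y%m%d%H%M%SZ')
        return parsed.replace(tzinfo=timezone.utc)
    if len(value) > 5 and value[-5] in '+-':
        # An explicit offset such as +0100 is converted to UTC
        parsed = datetime.strptime(value, '%Y%m%d%H%M%S%z')
        return parsed.astimezone(timezone.utc)
    # No timezone information: the time is assumed to already be UTC
    parsed = datetime.strptime(value, '%Y%m%d%H%M%S')
    return parsed.replace(tzinfo=timezone.utc)


print(parse_generalized_time('20250131120000'))
print(parse_generalized_time('20250131120000Z'))
print(parse_generalized_time('20250131120000+0100'))
```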

## Choice

The `Choice` class allows handling ASN.1 Choice structures. The `_alternatives`
property must be set to a `list` containing 2-3 element `tuple`s. The first
element in the tuple is the alternative name. The second element is the type
class for the alternative. The optional third element is a `dict` of
parameters to pass to the type class constructor. This is used primarily for
implicit and explicit tagging.

```python
from asn1crypto.core import Choice, Integer, OctetString, IA5String

class MyChoice(Choice):
    _alternatives = [
        ('option_one', Integer),
        ('option_two', OctetString),
        ('option_three', IA5String),
    ]
```

`Choice` objects have two extra properties, `.name` and `.chosen`. The `.name`
property contains the name of the chosen alternative. The `.chosen` property
contains the instance of the chosen type class.

```python
parsed = MyChoice.load(der_bytes)
print(parsed.name)
print(type(parsed.chosen))
```

The `.native` property and `.dump()` method work as with the universal type
classes. Under the hood they just proxy the calls to the `.chosen` object.
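Conceptually, the alternative is picked by looking at the tag of the encoded value. The pure-Python sketch below illustrates that for the three alternatives of `MyChoice`, using the universal tag numbers for INTEGER (2), OCTET STRING (4) and IA5String (22). The names `ALTERNATIVES_BY_TAG` and `choose_alternative` are hypothetical, and the sketch ignores tagging parameters and multi-byte tags.

```python
ALTERNATIVES_BY_TAG = {
    0x02: 'option_one',    # INTEGER
    0x04: 'option_two',    # OCTET STRING
    0x16: 'option_three',  # IA5String
}


def choose_alternative(der_bytes):
    """Return the alternative name matching the first (tag) byte."""
    if not der_bytes:
        raise ValueError('no data to inspect')
    tag = der_bytes[0]
    if tag not in ALTERNATIVES_BY_TAG:
        raise ValueError('tag 0x%02x matches no alternative' % tag)
    return ALTERNATIVES_BY_TAG[tag]


print(choose_alternative(b'\x02\x01\x05'))
print(choose_alternative(b'\x16\x02hi'))
```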

## Any

The `Any` class implements the ASN.1 Any type, which allows any data type. By
default objects of this class do not perform any parsing. However, the
`.parse()` instance method allows parsing the contents of the `Any` object,
either into a universal type, or to a specification passed in via the `spec`
parameter.

This type is not used as a top-level structure, but instead allows `Sequence`
and `Set` objects to accept varying contents, usually based on some sort of
`ObjectIdentifier`.

```python
from asn1crypto.core import Sequence, ObjectIdentifier, Any, Integer, OctetString

class MySequence(Sequence):
    _fields = [
        ('type', ObjectIdentifier),
        ('value', Any),
    ]
```
## Specification via OID

Throughout the usage of ASN.1 in cryptography, a pattern is present where an
`ObjectIdentifier` is used to determine what specification should be used to
interpret another field in a `Sequence`. Usually the other field is an instance
of `Any`, although occasionally it is an `OctetString` or `OctetBitString`.

*asn1crypto* provides the `_oid_pair` and `_oid_specs` properties of the
`Sequence` class to allow handling these situations.

The `_oid_pair` is a tuple with two unicode string elements. The first is the
name of the field that is an `ObjectIdentifier` and the second is the name of
the field that has a variable specification based on the first field. *In
situations where the value field should be an `OctetString` or `OctetBitString`,
`ParsableOctetString` and `ParsableOctetBitString` will need to be used instead
to allow for the sub-parsing of the contents.*

The `_oid_specs` property is a `dict` object with `ObjectIdentifier` values as
the keys (either dotted or mapped notation) and a type class as the value. When
the first field in `_oid_pair` has a value equal to one of the keys in
`_oid_specs`, then the corresponding type class will be used as the
specification for the second field of `_oid_pair`.

```python
from asn1crypto.core import Sequence, ObjectIdentifier, Any, OctetString, Integer

class MyId(ObjectIdentifier):
    _map = {
        '1.2.3.4': 'initialization_vector',
        '1.2.3.5': 'iterations',
    }

class MySequence(Sequence):
    _fields = [
        ('type', MyId),
        ('value', Any),
    ]

    _oid_pair = ('type', 'value')
    _oid_specs = {
        'initialization_vector': OctetString,
        'iterations': Integer,
    }
```
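As a usage sketch, and assuming *asn1crypto* is installed, the classes above can
be round-tripped through DER to see the `type` field select the spec for
`value`. The class definitions are repeated so the block runs on its own, and
the value `2048` is purely illustrative.

```python
from asn1crypto.core import Sequence, ObjectIdentifier, Any, OctetString, Integer

class MyId(ObjectIdentifier):
    _map = {
        '1.2.3.4': 'initialization_vector',
        '1.2.3.5': 'iterations',
    }

class MySequence(Sequence):
    _fields = [
        ('type', MyId),
        ('value', Any),
    ]

    _oid_pair = ('type', 'value')
    _oid_specs = {
        'initialization_vector': OctetString,
        'iterations': Integer,
    }

# Build a value whose OID maps to the Integer spec (2048 is illustrative)
original = MySequence({'type': 'iterations', 'value': Integer(2048)})
der_bytes = original.dump()

# On load, the 'type' field decides how 'value' is interpreted
parsed = MySequence.load(der_bytes)
value_class_name = type(parsed['value']).__name__
value_native = parsed['value'].native

print(value_class_name)
print(value_native)
```

With this arrangement the caller does not need to pick a spec by hand; the
lookup in `_oid_specs` happens when the `value` field is accessed.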
## Explicit and Implicit Tagging

When working with `Sequence`, `Set` and `Choice` it is often necessary to
disambiguate between fields because of a number of factors:

 - In `Sequence` the presence of an optional field must be determined by tag number
 - In `Set`, each field must have a different tag number since they can be in any order
 - In `Choice`, each alternative must have a different tag number to determine which is present

The universal types all have unique tag numbers. However, if a `Sequence`, `Set`
or `Choice` has more than one field with the same universal type, tagging allows
a way to keep the semantics of the original type, but with a different tag
number.

Implicit tagging simply changes the tag number of a type to a different value.
Explicit tagging, by contrast, wraps the existing type in another tag with the
specified tag number.

In general, most situations allow for implicit tagging, with the notable
exception that a field that is a `Choice` type must always be explicitly tagged.
Otherwise, using implicit tagging would modify the tag of the chosen
alternative, breaking the mechanism by which `Choice` works.

Here is an example of implicit and explicit tagging where explicit tagging on
the `Sequence` allows a `Choice` type field to be optional, and where implicit
tagging in the `Choice` structure allows disambiguating between two strings of
the same type.

```python
from asn1crypto.core import Sequence, Choice, IA5String, UTCTime, ObjectIdentifier

class Person(Choice):
    _alternatives = [
        ('name', IA5String),
        ('email', IA5String, {'implicit': 0}),
    ]

class Record(Sequence):
    _fields = [
        ('id', ObjectIdentifier),
        ('created', UTCTime),
        ('creator', Person, {'explicit': 0, 'optional': True}),
    ]
```
As shown above, the keys `implicit` and `explicit` are used for tagging,
and are passed to a type class constructor via the optional third element of
a field or alternative tuple. Both parameters may be an integer tag number, or
a 2-element tuple of string class name and integer tag.

If a tagged value needs its tagging changed, the `.untag()` method can be used
to create a copy of the object without explicit/implicit tagging. The `.retag()`
method can be used to change the tagging. This method accepts one parameter, a
dict with either or both of the keys `implicit` and `explicit`.

```python
person = Person(name='email', value='will@wbond.net')

# Will display True
print(person.implicit)

# Will display False
print(person.untag().implicit)

# Will display 0
print(person.tag)

# Will display 1
print(person.retag({'implicit': 1}).tag)
```
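To make the difference between the two tagging modes described in this section
concrete, here is a library-independent sketch that hand-encodes a short
`IA5String` with the context-specific tag `[0]`, first implicitly and then
explicitly. The helper `encode_tlv` is hypothetical (it is not part of
*asn1crypto*) and only handles tag numbers below 31 and contents shorter than
128 bytes.

```python
# Class and constructed bits of a DER identifier octet
CONTEXT_CLASS = 0x80
CONSTRUCTED = 0x20
IA5STRING_TAG = 0x16  # universal tag number 22


def encode_tlv(identifier, contents):
    """Hypothetical helper: single-octet identifier, short-form length, value."""
    if len(contents) >= 128:
        raise ValueError('this sketch only supports short-form lengths')
    return bytes([identifier, len(contents)]) + contents


contents = b'hi'

# Untagged: the universal IA5String identifier is used
plain = encode_tlv(IA5STRING_TAG, contents)

# Implicit [0]: the original identifier is replaced by the new tag
implicit = encode_tlv(CONTEXT_CLASS | 0, contents)

# Explicit [0]: the complete original encoding is wrapped in a constructed tag
explicit = encode_tlv(CONTEXT_CLASS | CONSTRUCTED | 0, plain)

print(plain.hex())     # 16026869
print(implicit.hex())  # 80026869
print(explicit.hex())  # a00416026869
```

The explicit form is two bytes longer, but it preserves the inner universal tag,
which is why it is the form that works for a `Choice` field.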