Index: Doc/c-api/unicode.rst
===================================================================
--- Doc/c-api/unicode.rst (revision 71700)
+++ Doc/c-api/unicode.rst (working copy)
@@ -209,6 +209,92 @@
    buffer, *NULL* if *unicode* is not a Unicode object.
 
 
+.. cfunction:: PyObject* PyUnicode_FromString(const char *u)
+
+   Return a Unicode Object from the c-string buffer *u* on success, or *NULL* on
+   failure. The parameter *u* must not be *NULL*; it will not be
+   checked.
+
+
+.. cfunction:: PyObject* PyUnicode_FromStringAndSize(const char *u, Py_ssize_t size)
+
+   Create a Unicode Object from the c-string buffer *u* of the given size. *u*
+   may be *NULL* which causes the contents to be undefined. It is the user's
+   responsibility to fill in the needed data. The buffer is copied into the new
+   object. If the buffer is not *NULL*, the return value might be a shared object.
+   Therefore, modification of the resulting Unicode object is only allowed when *u*
+   is *NULL*.
+
+
+.. cfunction:: PyObject* PyUnicode_FromFormat(const char *format, ...)
+
+   Take a C :cfunc:`printf`\ -style *format* string and a variable number of
+   arguments, calculate the size of the resulting Python string and return a
+   unicode object with the values formatted into it. The variable arguments
+   must be C types and must correspond exactly to the format characters in the
+   *format* string. The following format characters are allowed:
+
+   .. % This should be exactly the same as the table in PyErr_Format.
+   .. % One should just refer to the other.
+   .. % The descriptions for %zd and %zu are wrong, but the truth is complicated
+   .. % because not all compilers support the %z width modifier -- we fake it
+   .. % when necessary via interpolating PY_FORMAT_SIZE_T.
+   .. % %u, %lu, %zu should have "new in Python 2.5" blurbs.
+
+   +-------------------+---------------+--------------------------------+
+   | Format Characters | Type          | Comment                        |
+   +===================+===============+================================+
+   | :attr:`%%`        | *n/a*         | The literal % character.       |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%c`        | int           | A single character,            |
+   |                   |               | represented as a C int.        |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%d`        | int           | Exactly equivalent to          |
+   |                   |               | ``printf("%d")``.              |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%u`        | unsigned int  | Exactly equivalent to          |
+   |                   |               | ``printf("%u")``.              |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%ld`       | long          | Exactly equivalent to          |
+   |                   |               | ``printf("%ld")``.             |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%lu`       | unsigned long | Exactly equivalent to          |
+   |                   |               | ``printf("%lu")``.             |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
+   |                   |               | ``printf("%zd")``.             |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%zu`       | size_t        | Exactly equivalent to          |
+   |                   |               | ``printf("%zu")``.             |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%i`        | int           | Exactly equivalent to          |
+   |                   |               | ``printf("%i")``.              |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%x`        | int           | Exactly equivalent to          |
+   |                   |               | ``printf("%x")``.              |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%s`        | char\*        | A null-terminated C character  |
+   |                   |               | array.                         |
+   +-------------------+---------------+--------------------------------+
+   | :attr:`%p`        | void\*        | The hex representation of a C  |
+   |                   |               | pointer. Mostly equivalent to  |
+   |                   |               | ``printf("%p")`` except that   |
+   |                   |               | it is guaranteed to start with |
+   |                   |               | the literal ``0x`` regardless  |
+   |                   |               | of what the platform's         |
+   |                   |               | ``printf`` yields.             |
+   +-------------------+---------------+--------------------------------+
+
+   An unrecognized format character causes all the rest of the format string to be
+   copied as-is to the result string, and any extra arguments discarded.
+
+
+.. cfunction:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)
+
+   Identical to :cfunc:`PyUnicode_FromFormat` except that it takes exactly two
+   arguments.
+
+
 .. cfunction:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
 
    Return the length of the Unicode object.
@@ -716,6 +802,14 @@
    set. Separators are not included in the resulting list.
 
 
+.. cfunction:: PyObject* PyUnicode_RSplit(PyObject *s, PyObject *sep, Py_ssize_t maxsplit)
+
+   Split a string giving a list of Unicode strings. If *sep* is *NULL*, splitting
+   will be done at all whitespace substrings. Otherwise, splits occur at the given
+   separator. At most *maxsplit* splits will be done, the *rightmost* ones. If negative,
+   no limit is set. Separators are not included in the resulting list.
+
+
 .. cfunction:: PyObject* PyUnicode_Splitlines(PyObject *s, int keepend)
 
    Split a Unicode string at line breaks, returning a list of Unicode strings.
@@ -723,6 +817,22 @@
    characters are not included in the resulting strings.
 
 
+.. cfunction:: PyObject* PyUnicode_Partition(PyObject *str, PyObject *sep)
+
+   Split the string at the first occurrence of *sep*, and return a 3-tuple
+   containing the part before the separator, the separator itself, and the part
+   after the separator. If the separator is not found, return a 3-tuple containing
+   the string itself, followed by two empty strings.
+
+
+.. cfunction:: PyObject* PyUnicode_RPartition(PyObject *str, PyObject *sep)
+
+   Split the string at the last occurrence of *sep*, and return a 3-tuple
+   containing the part before the separator, the separator itself, and the part
+   after the separator. If the separator is not found, return a 3-tuple containing
+   two empty strings, followed by the string itself.
+
+
 .. cfunction:: PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, const char *errors)
 
    Translate a string by applying a character mapping table to it and return the
Index: Doc/data/refcounts.dat
===================================================================
--- Doc/data/refcounts.dat (revision 71700)
+++ Doc/data/refcounts.dat (working copy)
@@ -1405,6 +1405,21 @@
 PyUnicode_AsUnicode:Py_UNICODE*:::
 PyUnicode_AsUnicode:PyObject :*unicode:0:
 
+PyUnicode_FromString:PyObject*::+1:
+PyUnicode_FromString:const char*:u::
+
+PyUnicode_FromStringAndSize:PyObject*::+1:
+PyUnicode_FromStringAndSize:const char*:u::
+PyUnicode_FromStringAndSize:int:size::
+
+PyUnicode_FromFormat:PyObject*::+1:
+PyUnicode_FromFormat:const char*:format::
+PyUnicode_FromFormat::...::
+
+PyUnicode_FromFormatV:PyObject*::+1:
+PyUnicode_FromFormatV:const char*:format::
+PyUnicode_FromFormatV:va_list:vargs::
+
 PyUnicode_GetSize:int:::
 PyUnicode_GetSize:PyObject :*unicode:0:
 
@@ -1579,10 +1594,23 @@
 PyUnicode_Split:PyObject*:right:0:
 PyUnicode_Split:int:maxsplit::
 
+PyUnicode_RSplit:PyObject*::+1:
+PyUnicode_RSplit:PyObject*:left:0:
+PyUnicode_RSplit:PyObject*:right:0:
+PyUnicode_RSplit:int:maxsplit::
+
 PyUnicode_Splitlines:PyObject*::+1:
 PyUnicode_Splitlines:PyObject*:s:0:
 PyUnicode_Splitlines:int:maxsplit::
 
+PyUnicode_Partition:PyObject*::+1:
+PyUnicode_Partition:PyObject*:str:0:
+PyUnicode_Partition:PyObject*:sep:0:
+
+PyUnicode_RPartition:PyObject*::+1:
+PyUnicode_RPartition:PyObject*:str:0:
+PyUnicode_RPartition:PyObject*:sep:0:
+
 PyUnicode_Translate:PyObject*::+1:
 PyUnicode_Translate:PyObject*:str:0:
 PyUnicode_Translate:PyObject*:table:0:
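
Note on the splitting semantics documented above: the sketch below may help when checking the wording of the PyUnicode_Partition, PyUnicode_RPartition and PyUnicode_RSplit entries. It uses the Python-level methods ``partition``, ``rpartition``, ``split`` and ``rsplit``, on the assumption (not stated in the patch) that the C functions mirror them. The helper name ``split_examples`` is hypothetical and exists only for illustration.

```python
def split_examples(text, sep):
    """Return the results of the Python-level counterparts for *text*.

    Assumption: PyUnicode_Partition, PyUnicode_RPartition, PyUnicode_Split
    and PyUnicode_RSplit behave like the methods called here.
    """
    return {
        "partition": text.partition(sep),    # split at the first occurrence
        "rpartition": text.rpartition(sep),  # split at the last occurrence
        "split": text.split(sep, 1),         # at most one split, leftmost
        "rsplit": text.rsplit(sep, 1),       # at most one split, rightmost
    }


found = split_examples(u"a.b.c", u".")    # separator occurs twice
missing = split_examples(u"abc", u".")    # separator does not occur
print(found)
print(missing)
```

When the separator is absent, ``partition`` puts the whole string in the first slot and ``rpartition`` puts it in the last slot, which matches the two 3-tuple descriptions in the patch. With a limit of one split, ``split`` and ``rsplit`` differ only in which end the split is taken from.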