Series.str.extract(pat, flags=0, expand=None) [source]
For each subject string in the Series, extract groups from the first match of regular expression pat.
New in version 0.13.0.
| Parameters: |
pat : string Regular expression pattern with capturing groups flags : int, default 0 (no flags) re module flags, e.g. re.IGNORECASE .. versionadded:: 0.18.0 expand : bool, default False
|
|---|---|
| Returns: |
DataFrame with one row for each subject string, and one column for each group. Any capture group names in regular expression pat will be used for column names; otherwise capture group numbers will be used. The dtype of each result column is always object, even when no match is found. If expand=False and pat has only one capture group, then return a Series (if subject is a Series) or Index (if subject is an Index). |
See also
extractall
A pattern with two groups will return a DataFrame with two columns. Non-matches will be NaN.
>>> s = Series(['a1', 'b2', 'c3'])
>>> s.str.extract('([ab])(\d)')
0 1
0 a 1
1 b 2
2 NaN NaN
A pattern may contain optional groups.
>>> s.str.extract('([ab])?(\d)')
0 1
0 a 1
1 b 2
2 NaN 3
Named groups will become column names in the result.
>>> s.str.extract('(?P<letter>[ab])(?P<digit>\d)')
letter digit
0 a 1
1 b 2
2 NaN NaN
A pattern with one group will return a DataFrame with one column if expand=True.
>>> s.str.extract('[ab](\d)', expand=True)
0
0 1
1 2
2 NaN
A pattern with one group will return a Series if expand=False.
>>> s.str.extract('[ab](\d)', expand=False)
0 1
1 2
2 NaN
dtype: object
© 2011–2012 Lambda Foundry, Inc. and PyData Development Team
© 2008–2011 AQR Capital Management, LLC
© 2008–2014 the pandas development team
Licensed under the 3-clause BSD License.
http://pandas.pydata.org/pandas-docs/version/0.19.2/generated/pandas.Series.str.extract.html