
PythonTipsAndTricks

PTT

Series (ps)

REGEX

REGEX search and replacement

ps[ps.str.contains(REGEX)].str.replace(r'...(REGEX)...',r'...\1...',regex=True)
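A minimal sketch of the search-and-replace pattern, assuming a small made-up Series and a hypothetical pattern. Rows are first selected with a boolean mask, then rewritten with a captured group. `regex=True` is passed explicitly because recent pandas versions treat the pattern as a literal string by default.

```python
import pandas as pd

# Hypothetical data, for illustration only
ps = pd.Series(['vol. 12', 'no volume', 'vol. 7'])

# Boolean mask of the rows that match the pattern
mask = ps.str.contains(r'vol\. \d+', regex=True)

# Rewrite only the matching rows, reusing the captured number as \1
replaced = ps[mask].str.replace(r'vol\. (\d+)', r'volume \1', regex=True)
print(replaced.tolist())
```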

apply method

apply template

for a list of dictionaries

ps.apply( lambda l: [d if d else d for d in l] if l else l)

Series: Obtain the values of a key in a column of dictionaries:

ps.dict_column.apply(lambda x: x.get('key'))

Some sanity checks need to be done first. Either replace all NaN values in the Series with empty Python dict objects

>>> df['R']=df.R.apply(lambda x: {} if pd.isnull(x) else x)
>>> df
                    Q          R
0           {2: 2010}  {1: 2013}
1  {2: 2011, 3: 2009}         {}

Or filter out the NaN values

(ps[~ps.dict_column.isna()]).dict_column.apply(lambda x: x.get('key'))
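Both options can be sketched on a plain Series as follows. The data and the key name `'key'` are made up for illustration, and the Series is used directly instead of through a `dict_column` attribute.

```python
import pandas as pd

# Hypothetical Series mixing dictionaries and a missing value
ps = pd.Series([{'key': 1}, None, {'other': 2}])

# Option 1: replace missing entries with empty dicts, then look up the key
filled = ps.apply(lambda x: {} if pd.isnull(x) else x)
values_all = filled.apply(lambda x: x.get('key'))

# Option 2: drop the missing entries first, then look up the key
values_present = ps[~ps.isna()].apply(lambda x: x.get('key'))
```

Option 1 keeps the original index length, while option 2 returns only the rows that held a dictionary.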

Filter and combine dictionary values from a Series with a list of dictionaries with the same keys.

ps.apply(lambda x: [ str( d.get('key1'))+' '+str(d.get('key2')) for d  in x] )
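As an illustration, assuming each entry is a list of dictionaries with the hypothetical keys `key1` and `key2`:

```python
import pandas as pd

# Hypothetical Series: each entry is a list of dicts sharing the same keys
ps = pd.Series([
    [{'key1': 'Ana', 'key2': 'Ruiz'}, {'key1': 'Luis', 'key2': 'Mora'}],
    [{'key1': 'Eva', 'key2': 'Gil'}],
])

# Join the two values of every dictionary into one string per dictionary
combined = ps.apply(
    lambda x: [str(d.get('key1')) + ' ' + str(d.get('key2')) for d in x])
print(combined.tolist())
```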

How to apply a function to two columns of Pandas dataframe

See also stackoverflow

kk['lvr']=kk['SO'].str.lower().str.strip().map(un.unidecode).combine(\
  kk['SJR_Title'].str.lower().str.strip().map(un.unidecode) , func=lv.ratio)
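A self-contained sketch of the same `combine` pattern. Here `difflib.SequenceMatcher` from the standard library stands in for `lv.ratio`, the `unidecode` step is left out, and the data is made up, so the scores are not necessarily the ones the original snippet would give.

```python
import difflib
import pandas as pd

def similarity(a, b):
    # Stand-in for lv.ratio (assumption): similarity score between 0 and 1
    return difflib.SequenceMatcher(None, a, b).ratio()

# Hypothetical data with the same column names as above
kk = pd.DataFrame({
    'SO': [' Physical Review D ', 'Nature'],
    'SJR_Title': ['physical review d', 'Science'],
})

# combine() calls the function on the aligned elements of both columns
kk['lvr'] = kk['SO'].str.lower().str.strip().combine(
    kk['SJR_Title'].str.lower().str.strip(), func=similarity)
```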

Flatten column with lists of lists into a single list:

See: https://stackoverflow.com/a/38896038/2268280

df['col'].apply(pd.Series).stack().unique()
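A small sketch with made-up data, assuming every cell holds a flat list of the same length. `Series.explode` is shown as a possible alternative that avoids building the intermediate DataFrame.

```python
import pandas as pd

# Hypothetical column of lists
df = pd.DataFrame({'col': [[1, 2], [2, 3]]})

# Expand each list into columns, stack them into one Series, keep unique values
flat = df['col'].apply(pd.Series).stack().unique()

# Alternative: one row per list element, then unique values
flat_alt = df['col'].explode().unique()
```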

# Update a key in every dictionary of a list
def func(l):
    for i in range(len(l)):
        l[i]['key']='NEW VALUE'
    ...
    return l
>>> df['col'].apply(func)

Prepare a filter for the value of a key in a list of dictionaries

df['column'].apply(lambda l: [d.get('key')==value for d in l] 
       if l else [False]).apply(lambda l: True in l)
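The resulting boolean Series can be used directly as a row mask. A sketch with made-up data, where `value` is a hypothetical search value:

```python
import pandas as pd

# Hypothetical column of lists of dictionaries (one list is empty)
df = pd.DataFrame({'column': [
    [{'key': 'a'}, {'key': 'b'}],
    [],
    [{'key': 'c'}],
]})
value = 'b'  # hypothetical value to look for

# True for rows where at least one dictionary has key == value
mask = df['column'].apply(lambda l: [d.get('key') == value for d in l]
                          if l else [False]).apply(lambda l: True in l)
filtered = df[mask]
```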

Other

Change type

For example, from str to something else

ps.str.replace(r'^$','0',regex=True).astype(TYPE)

with TYPE: float, int, ...
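For instance, with a made-up Series of numeric strings where one entry is empty, the empty string is replaced by `'0'` before the cast so that `astype` does not fail:

```python
import pandas as pd

# Hypothetical Series of strings with one empty entry
ps = pd.Series(['1.5', '', '3'])

# '^$' matches only the empty string; regex=True is needed in recent pandas
as_float = ps.str.replace(r'^$', '0', regex=True).astype(float)
print(as_float.tolist())
```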

Series: Obtain first value of a list after str.split(pattern)

ps.título.str.lower().map(unidecode).str.split('(').str[0]

Select the first two words from a column of text

s=' '.join(ext.UDEA_título.str.lower().str.replace(' ',':: ').str.split('::').str[:2].loc[i])
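A possible alternative for the whole column at once, sketched with made-up data: split on whitespace, slice the resulting lists, and join them back. This is a different technique from the line above, not a reproduction of it.

```python
import pandas as pd

# Hypothetical text column
ps = pd.Series(['Pandas tips and tricks', 'Hello'])

# str.split() gives lists of words; .str[:2] slices each list; .str.join rebuilds text
first_two = ps.str.lower().str.split().str[:2].str.join(' ')
print(first_two.tolist())
```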

map upon series with a parameter

ps.map(lambda x: lv.ratio(x,parameter))

Plot histogram

ps.value_counts().plot(kind='bar')

Sorting index

ps.value_counts().sort_index().plot(kind='bar')

or values

ps.value_counts().sort_values().plot(kind='bar')

Convert two Series of numbers into a Series of lists of numbers

ps['col1'].map( lambda x: [x] )+ps['col2'].map( lambda x: [x] )
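A short sketch with a made-up DataFrame and hypothetical column names `col1` and `col2`. Each number is wrapped in a one-element list, and adding the two Series then concatenates the lists row by row.

```python
import pandas as pd

# Hypothetical numeric columns
df = pd.DataFrame({'col1': [1, 2], 'col2': [10, 20]})

# [x] + [y] concatenates the one-element lists for every row
pairs = df['col1'].map(lambda x: [x]) + df['col2'].map(lambda x: [x])
print(pairs.tolist())
```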

Convert a list of integers into a list of strings and join them with separator

ps['col_list'].map( lambda x: 
                            '\n'.join(  
                                list( map(str, x) ) 
                                  ) 
                            )

Obtain elements by slice

ps.iloc[1:7]

Fill missing keys from a list with empty string

def add_blank_missing_keys(ps,keys):
    '''
    Check if the keys are in a Pandas Series.
    If not the key value is initialized with
    the empty string
    '''
    for k in keys:
        ps[k]=ps.get(k)
    #Replace None with empty string
    return ps.fillna('')    
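A usage sketch with a made-up Series. The function is repeated so the example runs on its own, and a copy is passed in because the function modifies its argument in place.

```python
import pandas as pd

def add_blank_missing_keys(ps, keys):
    # Same logic as above: missing keys become None, then ''
    for k in keys:
        ps[k] = ps.get(k)
    return ps.fillna('')

# Hypothetical record with no 'doi' entry
row = pd.Series({'title': 'A paper', 'year': '2019'})
completed = add_blank_missing_keys(row.copy(), ['title', 'doi'])
print(completed.to_dict())
```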

Advanced examples

lambda function to combine two columns with a conditional on both columns

Normalize a column with NaN to string date format ‘YYYY-MM-DD’ from a column with integer year:

df.date.combine(df.year,func=lambda x,y: y if y>-1 and pd.isnull(x) else x)
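A sketch with made-up data. The first call uses the lambda exactly as written above, which fills missing dates with the bare integer year. The second variant is an assumption about the intended result: it formats the year as January 1st so that every filled entry follows 'YYYY-MM-DD'.

```python
import numpy as np
import pandas as pd

# Hypothetical data: -1 marks an unknown year
df = pd.DataFrame({
    'date': ['2015-03-01', np.nan, np.nan],
    'year': [2015, 2018, -1],
})

# As above: a missing date is replaced by the year when the year is valid
filled = df.date.combine(
    df.year, func=lambda x, y: y if y > -1 and pd.isnull(x) else x)

# Assumed variant: write the year as 'YYYY-01-01' to keep a single format
normalized = df.date.combine(
    df.year,
    func=lambda x, y: f'{int(y)}-01-01' if y > -1 and pd.isnull(x) else x)
```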