---
name: pandas-expert
description: Expert guidance for pandas data manipulation and analysis in Python. Use when working with DataFrames, Series, data cleaning, transformation, aggregation, merging/joining datasets, time series analysis, or any tabular data operations. Triggers on pandas imports, DataFrame operations, CSV/Excel file processing, data wrangling tasks, or questions about efficient pandas patterns.
---

# Pandas Expert

Expert guidance for pandas data manipulation and analysis.
## Quick Start

```python
import pandas as pd

# Read data
df = pd.read_csv('data.csv')
df = pd.read_excel('data.xlsx', sheet_name='Sheet1')

# Basic inspection
df.head()      # First 5 rows
df.info()      # Column dtypes and non-null counts
df.describe()  # Summary statistics for numeric columns
```
## Core Operations

### Selection & Filtering

```python
# Column selection
df['col']             # Single column (Series)
df[['col1', 'col2']]  # Multiple columns (DataFrame)

# Row filtering
df[df['col'] > 10]
df.query('col > 10 and other_col == "value"')
df.loc[df['col'].isin(['a', 'b'])]

# loc vs iloc
df.loc[row_labels, col_labels]  # Label-based
df.iloc[row_idx, col_idx]       # Integer position-based
```
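The loc/iloc distinction only becomes visible when the index is not the default 0..n-1 range. A minimal sketch with a made-up frame:

```python
import pandas as pd

# Hypothetical frame with a string index, so labels and positions differ
df = pd.DataFrame({'a': [10, 20, 30]}, index=['x', 'y', 'z'])

label_val = df.loc['y', 'a']  # label-based lookup: the row labelled 'y'
pos_val = df.iloc[1, 0]       # position-based lookup: second row, first column

# Both reach the same cell here, but only because 'y' is the second label
```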
### Data Cleaning

```python
# Missing values
df.isna().sum()                            # Count of NaNs per column
df.dropna(subset=['col'])                  # Drop rows where 'col' is NaN
df.fillna({'col': 0, 'other': 'unknown'})  # Per-column fill values

# Duplicates
df.drop_duplicates(subset=['key_col'])

# Type conversion
df['col'] = df['col'].astype('int64')
df['date'] = pd.to_datetime(df['date'])
```
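Chained on a toy frame (the values here are invented), the cleaning calls above behave like this:

```python
import pandas as pd

# Hypothetical frame with one missing value per column and a repeated key
df = pd.DataFrame({'col': [1.0, None, 3.0, 3.0],
                   'other': ['a', None, 'b', 'b']})

na_counts = df.isna().sum()                       # one NaN in each column
filled = df.fillna({'col': 0, 'other': 'unknown'})
deduped = filled.drop_duplicates(subset=['col'])  # keeps first of the two 3.0 rows
```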
### Transformation

```python
# Apply functions
df['new'] = df['col'].apply(lambda x: x * 2)
df['new'] = df.apply(lambda row: row['a'] + row['b'], axis=1)

# String operations
df['col'].str.lower()
df['col'].str.contains('pattern', regex=True)
df['col'].str.split('_', expand=True)

# Mapping values
df['col'].map({'old': 'new'})      # Unmapped values become NaN
df['col'].replace({'old': 'new'})  # Unmapped values pass through
```
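One subtlety worth a quick check: with a dict, `map` turns values missing from the mapping into NaN, while `replace` leaves them untouched. A sketch with made-up values:

```python
import pandas as pd

s = pd.Series(['old', 'keep'])

mapped = s.map({'old': 'new'})        # 'keep' has no mapping -> becomes NaN
replaced = s.replace({'old': 'new'})  # 'keep' passes through unchanged
```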
### Aggregation

```python
# GroupBy
df.groupby('category')['value'].sum()
df.groupby(['cat1', 'cat2']).agg({'val1': 'sum', 'val2': 'mean'})

# Pivot tables
pd.pivot_table(df, values='val', index='row', columns='col', aggfunc='sum')

# Crosstab
pd.crosstab(df['a'], df['b'], margins=True)
```
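A small worked example of the groupby and pivot calls (column names and values are invented):

```python
import pandas as pd

df = pd.DataFrame({'category': ['a', 'a', 'b'],
                   'region':   ['n', 's', 'n'],
                   'value':    [1, 2, 3]})

totals = df.groupby('category')['value'].sum()  # per-category sums
pivot = pd.pivot_table(df, values='value', index='category',
                       columns='region', aggfunc='sum')  # category x region grid
```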
### Merging & Joining

```python
# Merge (SQL-style joins)
pd.merge(df1, df2, on='key')
pd.merge(df1, df2, left_on='id', right_on='user_id', how='left')

# Concatenate
pd.concat([df1, df2], ignore_index=True)  # Vertical stack
pd.concat([df1, df2], axis=1)             # Horizontal stack
```
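How `how='left'` behaves on unmatched and multiply-matched keys, shown with made-up frames:

```python
import pandas as pd

users = pd.DataFrame({'id': [1, 2], 'name': ['ann', 'bob']})
orders = pd.DataFrame({'user_id': [1, 1, 3], 'total': [5, 7, 9]})

# Left join: every user is kept; ann matches two orders, bob matches none
# (NaN total), and the order for unknown user 3 is dropped
joined = pd.merge(users, orders, left_on='id', right_on='user_id', how='left')
```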
## Performance Tips

- Use vectorized operations - Avoid Python loops; use built-in pandas/numpy methods
- Filter early - Reduce DataFrame size before complex operations
- Use appropriate dtypes - `category` for low-cardinality strings, `int32` instead of `int64` when values fit
- Chunk large files - `pd.read_csv('file.csv', chunksize=10000)`
- Use `query()` for complex filters - Can be faster than boolean indexing on large DataFrames
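The `category` tip is easy to verify directly. A sketch comparing memory for a repetitive string column (the data is invented; exact byte counts vary by pandas version):

```python
import pandas as pd

# 3,000 values drawn from only three distinct strings
s = pd.Series(['red', 'green', 'blue'] * 1000)
cat = s.astype('category')

obj_bytes = s.memory_usage(deep=True)    # stores every string object
cat_bytes = cat.memory_usage(deep=True)  # stores small integer codes + 3 categories
```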
## Common Patterns

### Read multiple files

```python
from pathlib import Path

dfs = [pd.read_csv(f) for f in Path('data/').glob('*.csv')]
combined = pd.concat(dfs, ignore_index=True)
```
### Safe column renaming

```python
df.columns = df.columns.str.lower().str.replace(' ', '_')
```
### Memory-efficient reading

```python
df = pd.read_csv('large.csv', usecols=['col1', 'col2'], dtype={'col1': 'int32'})
```
## References

- See `references/advanced_patterns.md` for complex transformations, window functions, and time series operations
## Supported AI Coding Agents

This skill follows the SKILL.md standard and is compatible with all major AI coding agents that support it.