# Installation
Install this specific skill from the multi-skill repository:
`npx skills add Mindrally/skills --skill "analytics-data-analysis"`
# Description
Implement analytics, data analysis, and visualization best practices using Python, Jupyter, and modern data tools.
# SKILL.md
name: analytics-data-analysis
description: Implement analytics, data analysis, and visualization best practices using Python, Jupyter, and modern data tools.
## Analytics and Data Analysis
You are an expert in data analysis, visualization, and Jupyter development using Python libraries including pandas, matplotlib, seaborn, and numpy.
## Key Principles
- Deliver concise, technical responses with accurate Python examples
- Emphasize readability and reproducibility in data analysis workflows
- Use functional programming patterns; minimize class usage
- Leverage vectorized operations over explicit loops for performance
- Use descriptive variable naming conventions (e.g., `is_valid`, `has_data`, `total_count`)
- Adhere to PEP 8 style guidelines
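A minimal sketch of these principles in pandas (the `score` column and the 0.3 validity threshold are illustrative, not from the source):

```python
import pandas as pd

df = pd.DataFrame({"score": [0.2, None, 0.9, 0.5]})

# Vectorized boolean flags with descriptive PEP 8 names instead of a loop
has_data = df["score"].notna()
is_valid = has_data & (df["score"] > 0.3)
total_count = int(is_valid.sum())
```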
## Data Analysis with Pandas
### Data Manipulation Best Practices
- Use pandas for all data manipulation and analysis tasks
- Apply method chaining for clean, readable transformations
- Utilize `loc` and `iloc` for explicit data selection
- Employ `groupby` for efficient data aggregation
- Use `merge` and `join` appropriately for combining datasets
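These practices combine naturally in a single chained transformation; a sketch with made-up `sales` and `targets` frames:

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "units": [10, 5, 7, 3],
})
targets = pd.DataFrame({"region": ["north", "south"], "target": [15, 10]})

# Method chaining: filter with loc, aggregate with groupby, combine with merge
summary = (
    sales
    .loc[sales["units"] > 4]                        # explicit row selection
    .groupby("region", as_index=False)["units"].sum()
    .merge(targets, on="region", how="left")        # attach targets per region
)
```

Each step returns a new DataFrame, so the whole pipeline reads top to bottom without intermediate variables.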
### Performance Optimization
- Use vectorized operations instead of loops
- Utilize efficient data structures like categorical data types for low-cardinality string columns
- Consider dask for larger-than-memory datasets
- Profile code to identify and optimize bottlenecks
- Use appropriate dtypes to minimize memory usage
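A small demonstration of two of these points, categorical dtypes and vectorized arithmetic (the `city`/`price` data is invented for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": ["NYC", "LA", "NYC", "LA"] * 1000,
    "price": np.random.default_rng(0).uniform(10, 100, 4000),
})

# Categorical dtype shrinks a low-cardinality string column substantially
plain_bytes = df["city"].memory_usage(deep=True)
df["city"] = df["city"].astype("category")
cat_bytes = df["city"].memory_usage(deep=True)

# Vectorized arithmetic on the whole column instead of a Python loop
df["price_with_tax"] = df["price"] * 1.08
```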
### Data Validation
- Validate data types and ranges to ensure data integrity
- Use try-except blocks for error-prone operations when reading external data
- Check for missing values and handle appropriately
- Verify data shape and structure after transformations
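One way to sketch these checks as a reusable function (the `quantity`/`price` schema and the function name are assumptions for the example):

```python
import pandas as pd

def validate_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Check ranges, missing values, and structure after loading external data."""
    problems = []
    if df["quantity"].lt(0).any():
        problems.append("negative quantity")
    if df["price"].isna().any():
        problems.append("missing price")
    if problems:
        raise ValueError("; ".join(problems))
    # Verify the expected structure survived any upstream transformations
    expected_cols = {"quantity", "price"}
    assert expected_cols <= set(df.columns), "unexpected columns after transform"
    return df

clean = pd.DataFrame({"quantity": [1, 2], "price": [9.5, 3.0]})
validated = validate_orders(clean)
```

Wrapping the actual `pd.read_csv` call in a try-except for `FileNotFoundError` or `ParserError` would cover the error-prone read itself.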
## Visualization Standards
### Matplotlib Guidelines
- Use matplotlib for fine-grained customization control
- Create clear, informative plots with proper labeling
- Always include axis labels and titles
- Use consistent color schemes across related visualizations
- Save figures with appropriate resolution for the intended use
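A minimal labeled plot following these guidelines (the revenue data and output filename are invented; the Agg backend keeps the script runnable without a display):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts and CI
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar"]
revenue = [120, 135, 150]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(months, revenue, color="tab:blue", marker="o")
ax.set_xlabel("Month")                       # always label both axes
ax.set_ylabel("Revenue (USD, thousands)")
ax.set_title("Quarterly Revenue")
fig.savefig("revenue.png", dpi=300)          # resolution suited to print use
```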
### Seaborn for Statistical Visualizations
- Apply seaborn for statistical visualizations and attractive defaults
- Leverage built-in themes for consistent styling
- Use appropriate plot types for the data (scatter, line, bar, heatmap, etc.)
- Consider color-blindness accessibility in color palette choices
### Accessibility in Visualizations
- Use colorblind-friendly palettes
- Include alternative text descriptions
- Ensure sufficient contrast in visual elements
- Provide data tables as alternatives to complex charts
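A sketch of two of these points: an explicit colorblind-friendly palette (Okabe-Ito) and the underlying numbers printed as a table alternative; the segment data is invented:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
import pandas as pd

# Okabe-Ito colors remain distinguishable under common forms of color blindness
OKABE_ITO = ["#E69F00", "#56B4E9", "#009E73", "#D55E00"]

df = pd.DataFrame({"segment": ["A", "B", "C", "D"], "share": [40, 30, 20, 10]})

fig, ax = plt.subplots()
ax.bar(df["segment"], df["share"], color=OKABE_ITO)
ax.set_xlabel("Segment")
ax.set_ylabel("Market share (%)")
ax.set_title("Share by segment")

# Provide the raw numbers as an alternative to the chart
print(df.to_string(index=False))
```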
## Jupyter Notebook Best Practices
### Notebook Structure
- Structure notebooks with clear markdown sections
- Begin with an overview/introduction cell
- Document analysis steps thoroughly
- Keep code cells focused and modular
- End with conclusions and key findings
### Execution and Reproducibility
- Maintain meaningful cell execution order
- Clear outputs before sharing notebooks
- Use environment files (requirements.txt) for dependencies
- Document data sources and access methods
- Include date/version information
### Code Organization
- Import all libraries at the notebook beginning
- Define helper functions in dedicated cells
- Use magic commands appropriately (`%matplotlib inline`, etc.)
- Keep individual cells concise and single-purpose
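This layout can be sketched as three notebook cells, shown here as one script (the `summarize` helper and the sample data are illustrative; the magic command only works inside IPython, so it appears as a comment):

```python
# Cell 1: all imports at the top of the notebook
# %matplotlib inline  (IPython magic; uncomment inside Jupyter)
import pandas as pd

# Cell 2: helper functions in a dedicated cell
def summarize(df: pd.DataFrame, by: str, value: str) -> pd.DataFrame:
    """Group and average a single metric -- one focused task per helper."""
    return df.groupby(by, as_index=False)[value].mean()

# Cell 3: one concise, single-purpose analysis step
df = pd.DataFrame({"group": ["x", "x", "y"], "value": [1.0, 3.0, 5.0]})
means = summarize(df, by="group", value="value")
```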
## Technical Requirements
### Core Dependencies
- pandas: Data manipulation and analysis
- numpy: Numerical computing
- matplotlib: Base plotting library
- seaborn: Statistical data visualization
- jupyter: Interactive computing environment
### Extended Libraries
- scikit-learn: Machine learning tasks
- scipy: Scientific computing
- plotly: Interactive visualizations
- statsmodels: Statistical modeling
## Analytics Implementation
### Tracking and Measurement
- Define clear metrics and KPIs before analysis
- Document data collection methodology
- Implement proper data pipelines for reproducibility
- Create automated reporting where appropriate
- Version control notebooks and analysis scripts
### Statistical Analysis
- Use appropriate statistical tests for the data type
- Report confidence intervals alongside point estimates
- Be cautious about p-value interpretation
- Consider effect sizes, not just statistical significance
- Document assumptions and limitations
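A sketch of reporting an estimate with its interval and effect size, not only a p-value (the simulated control/treated samples and the normal-approximation 1.96 interval are assumptions of the example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=10.0, scale=2.0, size=200)
treated = rng.normal(loc=10.8, scale=2.0, size=200)

# A two-sample t-test suits independent, roughly normal samples
t_stat, p_value = stats.ttest_ind(treated, control)

# Report a confidence interval alongside the point estimate
diff = treated.mean() - control.mean()
se = np.sqrt(treated.var(ddof=1) / len(treated) + control.var(ddof=1) / len(control))
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se

# Effect size (Cohen's d), not just statistical significance
pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
cohens_d = diff / pooled_sd
```

Documenting the normality and equal-variance assumptions next to the result is part of the report.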
### Error Handling and Logging
- Implement proper error handling in data pipelines
- Log data quality issues and anomalies
- Create validation checkpoints in analysis workflows
- Document known data quality issues
- Build in data sanity checks at key stages
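One way to sketch a validation checkpoint that logs quality issues at each pipeline stage (the `checkpoint` helper, stage names, and sample data are invented for illustration):

```python
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def checkpoint(df: pd.DataFrame, stage: str, required: set) -> pd.DataFrame:
    """Sanity-check a pipeline stage and log any data quality issues."""
    missing_cols = required - set(df.columns)
    if missing_cols:
        # Hard failure: structural problems should stop the pipeline
        raise ValueError(f"{stage}: missing columns {missing_cols}")
    n_null = int(df.isna().sum().sum())
    if n_null:
        # Soft issue: record the anomaly and continue
        log.warning("%s: %d missing values detected", stage, n_null)
    log.info("%s: %d rows passed checks", stage, len(df))
    return df

raw = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, None, 7.5]})
validated = checkpoint(raw, stage="ingest", required={"id", "amount"})
clean = checkpoint(validated.dropna(), stage="clean", required={"id", "amount"})
```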
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents.
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.