mindrally

data-analyst

# Install this skill:
npx skills add Mindrally/skills --skill "data-analyst"

This installs a single skill from the multi-skill repository.

# Description

Data analysis best practices with pandas, numpy, matplotlib, seaborn, and Jupyter notebooks.

# SKILL.md


---
name: data-analyst
description: Data analysis best practices with pandas, numpy, matplotlib, seaborn, and Jupyter notebooks.
---


Data Analyst

You are an expert in data analysis with pandas, numpy, and visualization libraries.

Core Principles

  • Write reproducible analysis workflows
  • Prioritize data quality and validation
  • Create clear, informative visualizations
  • Document analysis decisions thoroughly

Data Manipulation

Pandas Best Practices

  • Use method chaining for readability
  • Prefer vectorized operations over loops
  • Use loc (label-based) and iloc (position-based) for explicit selection
  • Leverage groupby for aggregations
  • Handle missing data appropriately
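
The practices above can be sketched in one short chain. This is a minimal illustration, not a prescribed workflow: the `raw` frame, its column names, and the decision to drop rows with missing `units` are all assumptions invented for the example.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration only.
raw = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south", "north"],
        "units": [10, 4, None, 6, 5],
        "unit_price": [2.0, 3.0, 2.0, 3.0, 2.0],
    }
)

summary = (
    raw
    # Explicit missing-data decision: drop rows where units is unknown.
    .dropna(subset=["units"])
    # Vectorized column arithmetic instead of a Python loop over rows.
    .assign(revenue=lambda d: d["units"] * d["unit_price"])
    # groupby with named aggregations keeps the output columns self-describing.
    .groupby("region", as_index=False)
    .agg(total_units=("units", "sum"), total_revenue=("revenue", "sum"))
)

# loc selects by label or boolean mask; iloc selects by position.
north_revenue = summary.loc[summary["region"] == "north", "total_revenue"].iloc[0]
```

Whether dropping, imputing, or flagging missing rows is appropriate depends on the analysis; the point is that the choice appears as a visible step in the chain.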

NumPy Operations

  • Use broadcasting for efficiency
  • Apply vectorized functions
  • Handle array shapes carefully
  • Use appropriate dtypes
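
As a rough sketch of these points, the snippet below centers columns and scales rows with broadcasting, then compares memory use across two dtypes. The array values and variable names are hypothetical.

```python
import numpy as np

# Hypothetical measurements: 3 samples (rows) x 2 features (columns).
x = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]], dtype=np.float64)

# (3, 2) - (2,): the column means broadcast across every row.
col_mean = x.mean(axis=0)
centered = x - col_mean

# (3, 2) * (3, 1): adding an axis makes the per-row factors line up with rows.
row_scale = np.array([1.0, 2.0, 3.0])
scaled = x * row_scale[:, np.newaxis]

# A narrower dtype halves memory here, at the cost of precision.
x32 = x.astype(np.float32)
```

Checking `.shape` before combining arrays is a cheap way to catch accidental broadcasts, for example a `(3,)` vector applied along the wrong axis of a square matrix.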

Data Validation

  • Check data quality at analysis start
  • Validate data types and ranges
  • Handle missing values explicitly
  • Document data assumptions
  • Implement sanity checks
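
One possible shape for such checks is a function that collects problems instead of failing on the first one. The `validate_orders` helper, the expected column names, and the rule that quantities must be positive are hypothetical and would be replaced by the assumptions of the actual dataset.

```python
import pandas as pd


def validate_orders(df):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # Assumed schema for this example.
    expected = {"order_id", "quantity", "price"}
    missing = expected - set(df.columns)
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
        return problems  # remaining checks need these columns

    if df["order_id"].duplicated().any():
        problems.append("duplicate order_id values")

    # Type check first, then a range check on the non-missing values.
    if not pd.api.types.is_numeric_dtype(df["quantity"]):
        problems.append("quantity is not numeric")
    elif (df["quantity"].dropna() <= 0).any():
        problems.append("quantity has non-positive values")

    # Missing values are reported explicitly rather than silently dropped.
    n_missing = int(df["price"].isna().sum())
    if n_missing:
        problems.append(f"price has {n_missing} missing value(s)")

    return problems


orders = pd.DataFrame(
    {"order_id": [1, 2, 2], "quantity": [3, 0, 5], "price": [9.5, None, 4.0]}
)
problems = validate_orders(orders)
```

Running this at the top of a notebook and printing `problems` documents the data assumptions alongside the analysis.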

Visualization

Matplotlib

  • Use for low-level plotting control
  • Customize axes and labels properly
  • Save figures in appropriate formats (vector formats such as SVG or PDF for print, PNG for the web)
  • Use subplots for related plots
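
A small sketch of these points using the object-oriented API. The `Agg` backend, the temporary output directory, and the file names are assumptions chosen so the example runs without a display.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # headless backend, an assumption suited to scripts and CI
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# Two related plots side by side, sharing the y axis for easy comparison.
fig, (ax_sin, ax_cos) = plt.subplots(1, 2, figsize=(8, 3), sharey=True)
for ax, values, title in [(ax_sin, np.sin(x), "sin(x)"), (ax_cos, np.cos(x), "cos(x)")]:
    ax.plot(x, values)
    ax.set_title(title)
    ax.set_xlabel("x (radians)")
ax_sin.set_ylabel("amplitude")
fig.tight_layout()

# Raster output for the web, vector output for print.
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "trig.png")
svg_path = os.path.join(out_dir, "trig.svg")
fig.savefig(png_path, dpi=150)
fig.savefig(svg_path)
plt.close(fig)  # release the figure's memory once saved
```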

Seaborn

  • Apply for statistical visualizations
  • Use appropriate plot types for data
  • Leverage built-in themes
  • Customize color palettes
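
A brief illustration, assuming seaborn 0.11 or later (for `set_theme`). The `scores` frame is made up, and a box plot is only one reasonable choice for comparing a numeric variable across groups.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, an assumption for scripts and CI
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Built-in theme plus a named palette, applied globally.
sns.set_theme(style="whitegrid", palette="colorblind")

# Hypothetical data, invented for illustration.
scores = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b"],
        "score": [1.0, 2.0, 3.0, 2.0, 4.0, 6.0],
    }
)

fig, ax = plt.subplots(figsize=(4, 3))
# A box plot summarizes the distribution of a numeric variable per category.
sns.boxplot(data=scores, x="group", y="score", ax=ax)
ax.set_xlabel("Group")
ax.set_ylabel("Score")

# Palettes can also be fetched as plain RGB tuples for custom use.
palette = sns.color_palette("colorblind", n_colors=2)
plt.close(fig)
```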

Accessibility

  • Consider color-blindness in palettes
  • Use clear labels and legends
  • Provide alternative text descriptions
  • Ensure sufficient contrast
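
To make "sufficient contrast" checkable, one option is the WCAG contrast ratio between two sRGB colors. The sketch below implements that published formula in plain Python; the function names are hypothetical, and the 4.5:1 threshold mentioned afterwards is the WCAG AA level for normal-size text, which may or may not be the right target for a given chart element.

```python
def _linearize(channel):
    """Convert one 0-255 sRGB channel to linear light, per the WCAG definition."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) tuple with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """WCAG contrast ratio, from 1 (identical colors) up to 21 (black on white)."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


black, white, light_gray = (0, 0, 0), (255, 255, 255), (200, 200, 200)
label_ok = contrast_ratio(black, white) >= 4.5       # dark text on white passes
faint_ok = contrast_ratio(light_gray, white) >= 4.5  # light gray on white does not
```

A check like this covers contrast only; palette choice for color-blind readers and alternative text still need separate attention.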

Jupyter Best Practices

  • Structure notebooks with clear sections
  • Use markdown for documentation
  • Keep cells focused and modular
  • Ensure reproducible execution order
  • Clear cell outputs before committing notebooks to version control
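
To show what clearing outputs means at the file level, here is a standard-library sketch that works on the notebook's JSON structure (nbformat 4 layout assumed). The `clear_outputs` helper and the sample notebook are hypothetical; in practice an existing tool such as `jupyter nbconvert --clear-output --inplace` or nbstripout is the more usual route.

```python
import json


def clear_outputs(notebook):
    """Return a copy of an nbformat-4 style notebook dict with code outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical two-cell notebook, invented for illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Load data"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["1 + 1"],
            "execution_count": 7,
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

cleaned = clear_outputs(notebook)
```

Stripped notebooks produce smaller, more readable diffs and avoid committing data that appeared only in cell output.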

Performance

  • Profile slow operations
  • Use categorical dtypes for low-cardinality string columns
  • Consider chunked processing for large data
  • Cache intermediate results
  • Use efficient storage formats such as Parquet instead of CSV for large datasets
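
Two of these ideas, chunked processing and categorical dtypes, can be sketched on a small in-memory CSV. The data, column names, and chunk size are invented; a real file path would replace the `io.StringIO` buffer.

```python
import io

import pandas as pd

# Hypothetical CSV with 1000 rows and only three distinct city names.
cities = ["Oslo", "Lima", "Oslo", "Pune"] * 250
csv_text = "city,amount\n" + "\n".join(f"{city},{i}" for i, city in enumerate(cities))

# Chunked processing: aggregate each piece, then combine the partial results,
# so the full file never has to fit in memory at once.
totals = {}
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=200):
    for city, amount in chunk.groupby("city")["amount"].sum().items():
        totals[city] = totals.get(city, 0) + int(amount)

# Categorical dtype: a repeated string column is stored as small integer codes
# plus one copy of each distinct value.
df = pd.read_csv(io.StringIO(csv_text))
as_strings = df["city"].memory_usage(deep=True)
as_category = df["city"].astype("category").memory_usage(deep=True)
```

The categorical saving grows with the number of rows and shrinks as the number of distinct values approaches the row count, so measuring with `memory_usage(deep=True)` before and after is a sensible habit.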

Reporting

  • Create clear executive summaries
  • Include methodology documentation
  • Provide reproducible code
  • Export results in accessible formats

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents.
