GitHub - alexleybourne/emoji-data-cleaner: Cleans emoji data for a smaller and easier to use file

alexleybourne/emoji-data-cleaner

🚀 Emoji Data Optimizer

A utility to optimize emoji data for efficient use in applications. This tool takes the raw emoji data and transforms it into a compact, optimized format while preserving all essential information.

📊 Overview

This project optimizes the emoji-data dataset from iamcal, transforming the verbose format into a minimal JSON structure that:

  • ✅ Reduces file size significantly (93.96% smaller! 1.71MB → 105.80KB)
  • 🔍 Maintains all searchable names
  • 🎨 Preserves skin tone variations
  • 📁 Organizes by category
  • ↕️ Pre-sorts all emojis

The optimized format is ideal for applications that need to ship emoji data while keeping bundle size small and avoiding sorting at runtime.
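To make the transformation concrete, here is a minimal sketch of what such a compaction step could look like. It is not the project's actual cleaning code. The `RawEmoji` interface covers only a small subset of the fields in iamcal's emoji.json, and the function name `compactEmojiData`, the way names are derived, and the reading of the `u-` prefix as "the base code point" are all assumptions made for illustration.

```typescript
// Subset of the raw iamcal/emoji-data fields this sketch relies on.
interface RawEmoji {
  name: string;
  unified: string;
  short_names: string[];
  category: string;
  sort_order: number;
  skin_variations?: Record<string, { unified: string }>;
}

// Compact entry in the n/u/v shape described in this README.
interface CompactEmoji {
  n: string[];
  u: string;
  v?: string[];
}

// Hypothetical helper: pre-sort, group by category, and keep only n/u/v.
function compactEmojiData(raw: RawEmoji[]): Record<string, CompactEmoji[]> {
  const result: Record<string, CompactEmoji[]> = {};
  const sorted = raw.slice().sort((a, b) => a.sort_order - b.sort_order);
  for (const emoji of sorted) {
    // Assumption: primary name first, then the short names as search terms.
    const names = [emoji.name.toLowerCase()].concat(emoji.short_names);
    const entry: CompactEmoji = { n: names, u: emoji.unified };
    const variations = emoji.skin_variations;
    if (variations) {
      // Assumption: "u" abbreviates the base code point inside each variation.
      entry.v = Object.keys(variations).map((key) => {
        const full = variations[key].unified;
        return full.startsWith(emoji.unified + "-")
          ? "u" + full.slice(emoji.unified.length)
          : full;
      });
    }
    if (!result[emoji.category]) {
      result[emoji.category] = [];
    }
    result[emoji.category].push(entry);
  }
  return result;
}

// Illustrative input only; real entries carry many more fields.
const rawSample: RawEmoji[] = [
  {
    name: "WAVING HAND SIGN",
    unified: "1F44B",
    short_names: ["wave"],
    category: "People & Body",
    sort_order: 2,
    skin_variations: { "1F3FB": { unified: "1F44B-1F3FB" } },
  },
  {
    name: "THUMBS UP SIGN",
    unified: "1F44D",
    short_names: ["+1", "thumbsup"],
    category: "People & Body",
    sort_order: 1,
  },
];

console.log(JSON.stringify(compactEmojiData(rawSample), null, 2));
```

Sorting once before grouping means each category array comes out already ordered, so consumers do not need to sort again.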

📝 Data Source

Raw emoji data (RawEmojiData.json) is sourced from iamcal/emoji-data.

🔧 Usage

📦 Installation

# Clone the repository
git clone <repository-url>
cd <repository-directory>

# Install dependencies
yarn install

💻 Commands

# Process and optimize the emoji data
yarn emoji clean

# Show statistics about the emoji data
yarn emoji info

# Generate TypeScript type definitions
yarn emoji types

# Display help information
yarn emoji help

# Run tests
yarn test
# or if you don't have yarn installed
npx ts-node emojiTests.ts

🧩 Using the Optimized Data

Once processed, you can use the data in multiple ways:

Direct Import

import emojiData from './EmojiData.json';

// Access emojis by category
const peopleEmojis = emojiData['People & Body'];

Using Utility Functions

The project includes a utility library with helpful functions:

import emojiUtil from './emojiUtil';

// Find an emoji by name
const thumbsUp = emojiUtil.findEmojiByName('thumbs up');

// Get all emojis from a category
const smileys = emojiUtil.getEmojisByCategory('Smileys & Emotion');

// Convert unicode to emoji character
const thumbsUpChar = emojiUtil.unicodeToEmoji('1F44D');

// Get random emoji
const randomEmoji = emojiUtil.getRandomEmoji();

// Get all emoji categories
const categories = emojiUtil.getCategories();

// Get all skin tone variants for an emoji
const skinToneVariants = emojiUtil.getAllSkinToneVariants(thumbsUp);
console.log(skinToneVariants);
// Output: { 
//   'light': '👍🏻', 
//   'medium-light': '👍🏼', 
//   'medium': '👍🏽', 
//   'medium-dark': '👍🏾', 
//   'dark': '👍🏿' 
// }
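
For readers curious how a hex code point string such as '1F44D' becomes a character, the conversion can be sketched with the standard `String.fromCodePoint`. This is not the project's `emojiUtil.unicodeToEmoji` implementation. `hexToEmoji` is a hypothetical name, and the sketch assumes that multi-code-point sequences are written as hyphen-separated hex values, as in the iamcal data.

```typescript
// Hypothetical stand-in for emojiUtil.unicodeToEmoji; not the project's code.
function hexToEmoji(unified: string): string {
  // Assumption: multi-code-point sequences are hyphen-separated hex values.
  const codePoints = unified.split("-").map((hex) => parseInt(hex, 16));
  return String.fromCodePoint(...codePoints);
}

console.log(hexToEmoji("1F44D")); // 👍
console.log(hexToEmoji("1F44D-1F3FB")); // 👍🏻
```

A skin tone variant is simply the base code point followed by a modifier code point (U+1F3FB through U+1F3FF), which is why the second call renders as a single toned emoji.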

💾 Data Format

The optimized EmojiData.json uses the following compact structure:

{
  "Category Name": [
    {
      "n": ["emoji name", "alternative name", "another name"], // First value is the name and the rest are search terms
      "u": "1F44D",  // Unicode codepoint
      "v": ["u-1F3FB", "u-1F3FC"]  // Optional skin tone variations (optimized!)
    },
    // More emojis...
  ],
  // More categories...
}

Where:

  • n: Array of names, including the primary name and alternatives
  • u: Unicode code point of the emoji as a hexadecimal string (for example, 1F44D)
  • v: Optional array of skin tone variations (using the u- prefix to save space)
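
As a rough illustration of working with this structure directly, the sketch below types the compact format, looks an emoji up by any of its names, and expands the abbreviated variations. The names `CompactEntry`, `CompactData`, `expandVariation`, and `findByName` are hypothetical, the sample entry is illustrative, and the sketch assumes that the leading `u` in a variation stands for the emoji's own `u` value. The bundled utility library may behave differently.

```typescript
// Compact entry shape as described above; "v" is optional.
interface CompactEntry {
  n: string[];
  u: string;
  v?: string[];
}

type CompactData = Record<string, CompactEntry[]>;

// Inline sample in the documented shape (values are illustrative).
const sample: CompactData = {
  "People & Body": [
    { n: ["thumbs up", "+1", "thumbsup"], u: "1F44D", v: ["u-1F3FB", "u-1F3FC"] },
  ],
};

// Assumption: a leading "u" in a variation stands for the emoji's own "u" value.
function expandVariation(emoji: CompactEntry, variation: string): string {
  return variation.startsWith("u-") ? emoji.u + variation.slice(1) : variation;
}

// Hypothetical case-insensitive linear search over every category.
function findByName(data: CompactData, name: string): CompactEntry | undefined {
  const wanted = name.toLowerCase();
  for (const category of Object.keys(data)) {
    const match = data[category].find((entry) =>
      entry.n.some((candidate) => candidate.toLowerCase() === wanted)
    );
    if (match) {
      return match;
    }
  }
  return undefined;
}

const found = findByName(sample, "Thumbs Up");
if (found) {
  console.log((found.v || []).map((v) => expandVariation(found, v)));
  // [ '1F44D-1F3FB', '1F44D-1F3FC' ]
}
```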

📐 TypeScript Support

The project includes full TypeScript support with automatically generated type definitions:

  • Dynamic Types: Types are generated based on the actual data, ensuring type safety
  • Category-Specific Types: All emoji categories are mapped in a type-safe way
  • Type-Safe Access: Use the EmojiCategoryType union for type-safe category access

import { EmojiCategories, EmojiCategoryType, CompactEmoji } from './emojiTypes';
import emojiData from './EmojiData.json';

// Type-safe data
const data = emojiData as EmojiCategories;

// Type-safe category access
const category: EmojiCategoryType = 'Smileys & Emotion';
const smileys = data[category];
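
When a category name arrives at runtime (from user input, for example), a union type alone cannot validate it. One possible approach is a type guard, sketched below. `CATEGORIES`, `CategoryName`, and `isCategoryName` are hypothetical names and the list holds only two categories for brevity. The generated `EmojiCategoryType` is not reproduced here and may be defined differently.

```typescript
// Hypothetical subset of categories; a generated union would list them all.
const CATEGORIES = ["Smileys & Emotion", "People & Body"] as const;
type CategoryName = (typeof CATEGORIES)[number];

// Type guard: narrows an arbitrary string to a known category name.
function isCategoryName(value: string): value is CategoryName {
  return (CATEGORIES as readonly string[]).indexOf(value) !== -1;
}

const input: string = "People & Body";
if (isCategoryName(input)) {
  // Inside this branch the compiler treats input as CategoryName.
  const category: CategoryName = input;
  console.log(category);
}
```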

🧪 Testing

The project includes a comprehensive test suite to ensure all functionality works correctly. Run the tests with:

yarn test
# or if you don't have yarn installed
npx ts-node emojiTests.ts

Tests include:

  • Category Tests: Verifies all expected emoji categories exist
  • Search Tests: Ensures emoji can be found by name correctly
  • Skin Tone Tests: Verifies skin tone variations work properly
  • Unicode Tests: Checks conversion between unicode and emoji characters
  • Random Tests: Ensures random emoji selection works as expected
  • Count Tests: Validates emoji counting functions
  • Type Tests: Verifies TypeScript types are working correctly

Test output includes clear pass/fail indicators and a summary of results.

📜 License

This project is available under the MIT License. The original emoji data is from iamcal's emoji-data project.
