Skip to content

3328. Find Cities in Each State II πŸ”’

Description

Table: cities

+-------------+---------+
| Column Name | Type    | 
+-------------+---------+
| state       | varchar |
| city        | varchar |
+-------------+---------+
(state, city) is the combination of columns with unique values for this table.
Each row of this table contains the state name and the city name within that state.

Write a solution to find all the cities in each state and analyze them based on the following requirements:

  • Combine all cities into a comma-separated string for each state.
  • Only include states that have at least 3 cities.
  • Only include states where at least one city starts with the same letter as the state name.

Return the result table ordered by the count of matching-letter cities in descending order and then by state name in ascending order.

The result format is in the following example.

 

Example:

Input:

cities table:

+--------------+---------------+
| state        | city          |
+--------------+---------------+
| New York     | New York City |
| New York     | Newark        |
| New York     | Buffalo       |
| New York     | Rochester     |
| California   | San Francisco |
| California   | Sacramento    |
| California   | San Diego     |
| California   | Los Angeles   |
| Texas        | Tyler         |
| Texas        | Temple        |
| Texas        | Taylor        |
| Texas        | Dallas        |
| Pennsylvania | Philadelphia  |
| Pennsylvania | Pittsburgh    |
| Pennsylvania | Pottstown     |
+--------------+---------------+

Output:

+-------------+-------------------------------------------+-----------------------+
| state       | cities                                    | matching_letter_count |
+-------------+-------------------------------------------+-----------------------+
| Pennsylvania| Philadelphia, Pittsburgh, Pottstown       | 3                     |
| Texas       | Dallas, Taylor, Temple, Tyler             | 3                     |
| New York    | Buffalo, Newark, New York City, Rochester | 2                     |
+-------------+-------------------------------------------+-----------------------+

Explanation:

  • Pennsylvania:
    • Has 3 cities (meets minimum requirement)
    • All 3 cities start with 'P' (same as state)
    • matching_letter_count = 3
  • Texas:
    • Has 4 cities (meets minimum requirement)
    • 3 cities (Taylor, Temple, Tyler) start with 'T' (same as state)
    • matching_letter_count = 3
  • New York:
    • Has 4 cities (meets minimum requirement)
    • 2 cities (Newark, New York City) start with 'N' (same as state)
    • matching_letter_count = 2
  • California is not included in the output because:
    • Although it has 4 cities (meets minimum requirement)
    • No cities start with 'C' (doesn't meet the matching letter requirement)

Note:

  • Results are ordered by matching_letter_count in descending order
  • When matching_letter_count is the same (Texas and New York both have 2), they are ordered by state name alphabetically
  • Cities in each row are ordered alphabetically

Solutions

Solution 1: Group Aggregation + Filtering

We can group the cities table by the state field, then apply filtering on each group to retain only the groups that meet the specified conditions.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
# Write your MySQL query statement below
SELECT
    state,
    GROUP_CONCAT(city ORDER BY city SEPARATOR ', ') AS cities,
    COUNT(
        CASE
            WHEN LEFT(city, 1) = LEFT(state, 1) THEN 1
        END
    ) AS matching_letter_count
FROM cities
GROUP BY 1
HAVING COUNT(city) >= 3 AND matching_letter_count > 0
ORDER BY 3 DESC, 1;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
import pandas as pd


def state_city_analysis(cities: pd.DataFrame) -> pd.DataFrame:
    cities["matching_letter"] = cities["city"].str[0] == cities["state"].str[0]

    result = (
        cities.groupby("state")
        .agg(
            cities=("city", lambda x: ", ".join(sorted(x))),
            matching_letter_count=("matching_letter", "sum"),
            city_count=("city", "count"),
        )
        .reset_index()
    )

    result = result[(result["city_count"] >= 3) & (result["matching_letter_count"] > 0)]

    result = result.sort_values(
        by=["matching_letter_count", "state"], ascending=[False, True]
    )

    result = result.drop(columns=["city_count"])

    return result

Comments