3.3. Comprehensions
- Author: xiaobai
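1. Overview
Comprehensions are a concise way to build data structures in Python: a single expression produces a list, a dictionary, or a set (or, in the generator form, a lazy iterator). For simple transformations they are usually easier to read than the equivalent loop, and they typically run somewhat faster than a loop that calls append.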
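2. Core concepts
2.1. Advantages of comprehensions
| Advantage | Description | Example |
|---|---|---|
| Concise | A loop and a condition fit in one expression | [x**2 for x in range(5)] |
| Readable | Clear structure that states the intent | More direct than the equivalent for loop in simple cases |
| Fast | Usually faster than a loop with append | Optimized underlying implementation |
| Flexible | Supports filter conditions, nested loops, and more | Multi-step data processing |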
2.2. Types of comprehensions
| Type | Syntax | Result type | Example |
|---|---|---|---|
| List comprehension | [expr for item in iterable] | list | [x**2 for x in range(5)] |
| Dictionary comprehension | {key: value for item in iterable} | dict | {x: x**2 for x in range(5)} |
| Set comprehension | {expr for item in iterable} | set | {x**2 for x in range(5)} |
| Generator expression | (expr for item in iterable) | generator | (x**2 for x in range(5)) |
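The four forms in the table differ only in their brackets and in what they return. The short sketch below applies the same squaring expression in each form and prints the resulting type; the variable names are illustrative only.

```python
# The same squaring expression in each of the four forms.
squares_list = [x**2 for x in range(5)]
squares_dict = {x: x**2 for x in range(5)}
squares_set = {x**2 for x in range(5)}
squares_gen = (x**2 for x in range(5))

for obj in (squares_list, squares_dict, squares_set, squares_gen):
    print(type(obj).__name__)
# list
# dict
# set
# generator

# Empty braces create a dict, not a set; use set() for an empty set.
print(type({}).__name__)     # dict
print(type(set()).__name__)  # set
```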
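3. List Comprehensions
3.1. Basic syntax
[expression for variable in iterable (optional if condition)]
Components:
- expression: the code applied to each element
- for variable in iterable: iterates over the data source
- if condition: an optional filter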
3.2. Basic example
# Traditional approach
squares = []
for x in range(5):
    squares.append(x**2)
print(squares)  # [0, 1, 4, 9, 16]
# List comprehension
squares = [x**2 for x in range(5)]
print(squares)  # [0, 1, 4, 9, 16]
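3.3. List comprehensions with conditions
# Squares of the even numbers only
even_squares = [x**2 for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]
# Multiple conditions (consecutive if clauses behave like "and")
numbers = [x for x in range(20) if x % 2 == 0 if x % 3 == 0]
print(numbers)  # [0, 6, 12, 18]
# Conditional expression (ternary operator): transforms every element, filters nothing
results = [x if x % 2 == 0 else 'odd' for x in range(5)]
print(results)  # [0, 'odd', 2, 'odd', 4]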
3.4. Nested loops
# Flatten a two-dimensional list
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flattened = [num for row in matrix for num in row]
print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
# Equivalent to:
flattened = []
for row in matrix:
    for num in row:
        flattened.append(num)
# Build a multiplication table
multiplication_table = [[i * j for j in range(1, 6)] for i in range(1, 6)]
print(multiplication_table)
# [[1, 2, 3, 4, 5],
#  [2, 4, 6, 8, 10],
#  [3, 6, 9, 12, 15],
#  [4, 8, 12, 16, 20],
#  [5, 10, 15, 20, 25]]
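The two nesting styles above do different things: several for clauses in one comprehension flatten, while a comprehension inside a comprehension builds nested lists. As a further sketch (the data is made up for illustration), the first example below transposes a matrix with the nested style and the second produces filtered pairs with the flat style.

```python
matrix = [[1, 2, 3], [4, 5, 6]]

# Transpose: the outer comprehension walks the column indexes,
# the inner one collects that column from every row.
transposed = [[row[i] for row in matrix] for i in range(len(matrix[0]))]
print(transposed)  # [[1, 4], [2, 5], [3, 6]]

# Pairs from two ranges with a filter; the clauses read left to right
# in the same order as the equivalent nested for loops.
pairs = [(x, y) for x in range(3) for y in range(3) if x < y]
print(pairs)  # [(0, 1), (0, 2), (1, 2)]
```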
4. Dictionary Comprehensions
4.1. Basic syntax
{key_expression: value_expression for variable in iterable (optional if condition)}
4.2. Basic examples
# The simplest dictionary comprehension
d = {x: x**2 for x in range(5)}
print(d)  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
# Dictionary comprehension with a condition
d = {x: x**2 for x in range(5) if x % 2 == 0}
print(d)  # {0: 0, 2: 4, 4: 16}
4.3. Practical use cases
4.3.1. Filtering data
# Keep only the students who passed (a score of 60 or above)
scores = {
    'Alice': 85,
    'Bob': 58,
    'Charlie': 71,
    'David': 49
}
passed = {name: score for name, score in scores.items() if score >= 60}
print(passed)  # {'Alice': 85, 'Charlie': 71}
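4.3.2. Swapping keys and values
# Swap the keys and values of a dictionary
# (assumes the values are hashable and unique; duplicate values would overwrite each other)
fruit_colors = {'apple': 'red', 'banana': 'yellow', 'grape': 'purple'}
color_fruits = {color: fruit for fruit, color in fruit_colors.items()}
print(color_fruits)  # {'red': 'apple', 'yellow': 'banana', 'purple': 'grape'}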
4.3.3. Conditional transformation
# Replace failing scores with the label 'fail'
scores = {
    'Alice': 85,
    'Bob': 58,
    'Charlie': 71,
    'David': 49
}
result = {name: (score if score >= 60 else 'fail') for name, score in scores.items()}
print(result)  # {'Alice': 85, 'Bob': 'fail', 'Charlie': 71, 'David': 'fail'}
4.4. Dictionary comprehensions with conditions
# Keep only the items whose value is greater than 2
numbers = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
filtered_dict = {k: v for k, v in numbers.items() if v > 2}
print(filtered_dict)  # {'c': 3, 'd': 4, 'e': 5}
# Change values based on a condition (even values are doubled)
processed_dict = {k: (v * 2 if v % 2 == 0 else v) for k, v in numbers.items()}
print(processed_dict)  # {'a': 1, 'b': 4, 'c': 3, 'd': 8, 'e': 5}
# Build a dictionary from two lists
keys = ['name', 'age', 'city']
values = ['Alice', 25, 'New York']
person_dict = {keys[i]: values[i] for i in range(len(keys))}
print(person_dict)  # {'name': 'Alice', 'age': 25, 'city': 'New York'}
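The index-based version above works, but pairing two lists is more commonly written with the built-in zip. The sketch below reuses the same keys and values; the `short` example is an added illustration of how zip behaves when the lists differ in length.

```python
keys = ['name', 'age', 'city']
values = ['Alice', 25, 'New York']

# zip pairs the elements positionally, so no index arithmetic is needed.
person_dict = {k: v for k, v in zip(keys, values)}
print(person_dict)  # {'name': 'Alice', 'age': 25, 'city': 'New York'}

# With no transformation or filter, dict(zip(...)) gives the same result.
same = dict(zip(keys, values))
print(same == person_dict)  # True

# zip stops at the shorter input, so unmatched keys are silently dropped.
short = {k: v for k, v in zip(keys, ['Bob'])}
print(short)  # {'name': 'Bob'}
```

The comprehension form earns its keep once a filter or a transformation of the key or value is needed; for a plain pairing, dict(zip(keys, values)) is shorter.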
5. Set Comprehensions
5.1. Basic syntax
{expression for variable in iterable (optional if condition)}
5.2. Basic examples
# Set of unique squares
squares_set = {x**2 for x in range(-5, 6)}
print(squares_set)  # {0, 1, 4, 9, 16, 25}
# Remove duplicates from a list
words = ['hello', 'world', 'hello', 'python', 'world']
unique_words = {word for word in words}
print(unique_words)  # {'hello', 'world', 'python'} (element order may vary between runs)
# Set comprehension with a condition
even_squares = {x**2 for x in range(10) if x % 2 == 0}
print(even_squares)  # {0, 64, 4, 36, 16} (sets are unordered; the display order is an implementation detail)
6. Generator Expressions
6.1. Basic syntax
(expression for variable in iterable (optional if condition))
6.2. Basic examples
# Generator expression
squares_gen = (x**2 for x in range(5))
print(squares_gen)  # <generator object <genexpr> at 0x...>
# Convert to a list
print(list(squares_gen))  # [0, 1, 4, 9, 16]
# Generator expression with a condition
even_squares_gen = (x**2 for x in range(10) if x % 2 == 0)
print(list(even_squares_gen))  # [0, 4, 16, 36, 64]
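6.3. Advantages of generator expressions
# Memory comparison
import sys
n = 100000
# List comprehension: builds every element immediately
list_comp = [x**2 for x in range(n)]
# Generator expression: evaluates lazily
gen_expr = (x**2 for x in range(n))
# sys.getsizeof reports only the container object itself, not the integers it refers to;
# the exact figures depend on the Python version and platform.
print(f"List comprehension size: {sys.getsizeof(list_comp)} bytes")  # e.g. 800984 bytes
print(f"Generator expression size: {sys.getsizeof(gen_expr)} bytes")  # e.g. 200 bytes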
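Two practical consequences of lazy evaluation are worth seeing in code: a generator expression can feed an aggregating function such as sum without ever building a list, and it can be consumed only once. The following is a minimal sketch with illustrative variable names.

```python
n = 100000

# Passing a generator expression straight to sum() avoids building a list.
total = sum(x**2 for x in range(n))
print(total == sum([x**2 for x in range(n)]))  # True

# A generator can be consumed only once.
gen = (x**2 for x in range(5))
first_pass = list(gen)
second_pass = list(gen)
print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []
```

If the values are needed more than once, a list comprehension (or re-creating the generator) is the safer choice.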
7. Nesting and more complex usage
7.1. Multi-level nested comprehensions
# Flatten a three-dimensional nested list
three_d = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
flattened_3d = [num for matrix in three_d for row in matrix for num in row]
print(flattened_3d)  # [1, 2, 3, 4, 5, 6, 7, 8]
# Build a nested dictionary with dictionary comprehensions
nested_dict = {
    f'group_{i}': {f'item_{j}': i * j for j in range(1, 4)}
    for i in range(1, 4)
}
print(nested_dict)
# {'group_1': {'item_1': 1, 'item_2': 2, 'item_3': 3},
#  'group_2': {'item_1': 2, 'item_2': 4, 'item_3': 6},
#  'group_3': {'item_1': 3, 'item_2': 6, 'item_3': 9}}
7.2. Complex conditional logic
# Filtering with a compound condition
numbers = range(20)
complex_filter = [
    x for x in numbers
    if (x % 2 == 0 and x < 10) or (x % 3 == 0 and x > 10)
]
print(complex_filter)  # [0, 2, 4, 6, 8, 12, 15, 18]
# Move a complicated test into a function
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True
# Select the primes
primes = [x for x in range(2, 30) if is_prime(x)]
print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
8. Practical use cases
8.1. Data cleaning and transformation
# Clean up strings
raw_data = [' hello ', 'WORLD', ' Python ', 'PROGRAMMING']
cleaned_data = [word.strip().lower() for word in raw_data]
print(cleaned_data)  # ['hello', 'world', 'python', 'programming']
# Extract numbers (assumes each line contains at most one number)
text_lines = ['Price: $100', 'Weight: 2.5kg', 'Count: 25 items']
numbers = [
    float(''.join(c for c in line if c.isdigit() or c == '.'))
    for line in text_lines
    if any(c.isdigit() for c in line)
]
print(numbers)  # [100.0, 2.5, 25.0]
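8.2. File handling
def process_file_data(filename):
    """Read a text file and return its lines, the error lines, and a line-number index."""
    with open(filename, 'r', encoding='utf-8') as file:
        # Read the non-empty lines and strip surrounding whitespace
        lines = [line.strip() for line in file if line.strip()]
    # Select the lines that contain the keyword
    keyword_lines = [line for line in lines if 'error' in line.lower()]
    # Map line numbers (counted over non-empty lines, starting at 1) to lines
    line_dict = {i: line for i, line in enumerate(lines, 1)}
    return lines, keyword_lines, line_dict
# Usage example (assumes a file named log.txt exists in the working directory)
lines, errors, numbered = process_file_data('log.txt')
print("All lines:", lines)
print("Error lines:", errors)
print("Line-number dictionary:", numbered)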
8.3. Data analysis and processing
# Student score analysis
students = [
    {'name': 'Alice', 'scores': [85, 92, 78]},
    {'name': 'Bob', 'scores': [76, 88, 95]},
    {'name': 'Charlie', 'scores': [90, 85, 92]}
]
# Compute the average score
student_averages = {
    student['name']: sum(student['scores']) / len(student['scores'])
    for student in students
}
print(student_averages)  # {'Alice': 85.0, 'Bob': 86.33..., 'Charlie': 89.0} (Bob's value is unrounded)
# Find the students whose average is above 85
high_achievers = [
    student['name'] for student in students
    if sum(student['scores']) / len(student['scores']) > 85
]
print(high_achievers)  # ['Bob', 'Charlie']
# Find the students who passed every subject
passing_students = [
    student['name'] for student in students
    if all(score >= 60 for score in student['scores'])
]
print(passing_students)  # ['Alice', 'Bob', 'Charlie']
9. Performance comparison
9.1. Comprehensions vs. traditional loops
import timeit
def test_performance():
    n = 10000
    # List comprehension
    list_comp_time = timeit.timeit(
        '[x**2 for x in range(n)]',
        globals={'n': n},
        number=1000
    )
    # Traditional for loop (the statement must start at column 0 for timeit)
    for_loop_time = timeit.timeit(
        'result = []\n'
        'for x in range(n):\n'
        '    result.append(x**2)',
        globals={'n': n},
        number=1000
    )
    # Generator expression, materialized with list()
    gen_expr_time = timeit.timeit(
        'list(x**2 for x in range(n))',
        globals={'n': n},
        number=1000
    )
    print(f"List comprehension: {list_comp_time:.4f} s")
    print(f"Traditional loop: {for_loop_time:.4f} s")
    print(f"Generator expression: {gen_expr_time:.4f} s")
test_performance()
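9.2. Performance summary
Actual timings depend on the Python version and the workload; the table gives the typical picture.
| Approach | Speed | Memory use | Best suited to |
|---|---|---|---|
| List comprehension | Usually fastest | High (builds the whole list) | A complete list is needed |
| Traditional loop | Moderate | High (builds the whole list) | Complex logic |
| Generator expression | Moderate | Low (lazy evaluation) | Large datasets |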
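10. Advanced techniques and patterns
10.1. Using the walrus operator (Python 3.8+)
The walrus operator (:=) assigns a value to a variable inside an expression, so a value computed in the filter can be reused in the output expression.
# Using the walrus operator in a comprehension
data = [" apple ", "banana", " ", "cherry", "date "]
results = [clean.upper()
           for item in data
           if (clean := item.strip())]  # keep only items that are non-empty after stripping
print(results)  # ['APPLE', 'BANANA', 'CHERRY', 'DATE']
# Filtering before conversion (no walrus operator is needed here)
numbers = ['42', 'not_a_num', '99', 'abc', '123']
valid_numbers = [int(s) for s in numbers if s.isdigit()]
print(valid_numbers)  # [42, 99, 123]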
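The benefit of the walrus operator is clearest when the filter and the output both need the same computed value. The sketch below uses a hypothetical helper, normalize, standing in for any costly computation, and compares the version that calls it twice with the version that calls it once.

```python
def normalize(text):
    """Hypothetical helper standing in for an expensive computation."""
    return text.strip().lower()

data = ["  Apple ", "", "  ", "Banana"]

# Without the walrus operator, normalize() runs twice for every kept item.
twice = [normalize(item) for item in data if normalize(item)]

# With it, the result is computed once in the filter and reused as the output.
once = [cleaned for item in data if (cleaned := normalize(item))]

print(once)           # ['apple', 'banana']
print(twice == once)  # True
```

One thing to keep in mind: unlike the loop variable, a name bound with := inside a comprehension lives in the enclosing scope, so `cleaned` remains defined after the comprehension finishes.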
10.2. Exception handling in comprehensions
# A comprehension cannot contain try/except, so wrap the risky call in a function
raw_data = ['123', '456', 'abc', '789', 'def']
def safe_int_convert(value):
    try:
        return int(value)
    except ValueError:
        return None
# Use the safe conversion function
numbers = [safe_int_convert(x) for x in raw_data]
clean_numbers = [x for x in numbers if x is not None]
print(clean_numbers)  # [123, 456, 789]
# A shorter alternative when every valid value is a plain run of digits
numbers = [int(x) for x in raw_data if x.isdigit()]
print(numbers)  # [123, 456, 789]
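The two approaches above agree on that input, but they are not interchangeable in general: str.isdigit rejects signs and surrounding whitespace that int itself accepts. The sketch below uses made-up input to show the difference and combines the helper with the walrus operator (Python 3.8+) to filter in a single pass.

```python
raw_data = ['123', '-45', ' 67 ', '3.5', 'abc']

def safe_int_convert(value):
    """Return int(value), or None if value is not a valid integer literal."""
    try:
        return int(value)
    except ValueError:
        return None

# str.isdigit() rejects the minus sign and the surrounding whitespace.
via_isdigit = [int(x) for x in raw_data if x.isdigit()]
print(via_isdigit)  # [123]

# The try/except helper accepts everything int() itself accepts.
via_helper = [n for x in raw_data if (n := safe_int_convert(x)) is not None]
print(via_helper)  # [123, -45, 67]
```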
10.3. Multi-step data processing pipeline
def data_processing_pipeline(data):
    """Process data in several steps."""
    # Step 1: clean and validate
    cleaned = [item.strip().lower() for item in data if item.strip()]
    # Step 2: filter and transform
    processed = [
        f"processed_{item}" for item in cleaned
        if len(item) > 3 and not item.startswith('test')
    ]
    # Step 3: build a lookup dictionary
    lookup = {item: len(item) for item in processed}
    return cleaned, processed, lookup
# Run the pipeline
raw_data = [' Hello ', ' test_data ', 'WORLD ', ' py ', ' ']
cleaned, processed, lookup = data_processing_pipeline(raw_data)
print("Cleaned:", cleaned)  # ['hello', 'test_data', 'world', 'py']
print("Processed:", processed)  # ['processed_hello', 'processed_world']
print("Lookup:", lookup)  # {'processed_hello': 15, 'processed_world': 15}
11. Best practices and caveats
11.1. Readability
# Good: simple and clear
squares = [x**2 for x in range(10)]
# Avoid overly complex comprehensions;
# logic like the following reads better as a traditional loop
def generate_triples():
    result = []
    for x in range(5):
        for y in range(5):
            if x != y:
                for z in range(5):
                    if x + y > z and (x % 2 == 0 or y % 2 == 1):
                        result.append((x, y, z))
    return result
11.2. When to use comprehensions
11.2.1. Cases well suited to comprehensions:
# 1. Simple transformation and filtering
numbers = [1, 2, 3, 4, 5]
doubled_evens = [x * 2 for x in numbers if x % 2 == 0]
print(doubled_evens)  # [4, 8]
# 2. Building a new data structure
word_lengths = {word: len(word) for word in ['hello', 'world']}
print(word_lengths)  # {'hello': 5, 'world': 5}
# 3. Simple data extraction
people_list = [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 30}]
names = [person['name'] for person in people_list if person['age'] > 18]
print(names)  # ['Alice', 'Bob']
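11.2.2. Cases poorly suited to comprehensions:
- Complex logic: use a traditional loop instead
- Intermediate results that must be referenced several times
- Complex operations that need exception handling
- Nesting deep enough to hurt readability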
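As a rough illustration of the cases listed above, the sketch below handles input that needs both exception handling and several named intermediate values, which is awkward to express in a single comprehension. The function summarize and its threshold of 10 are hypothetical, chosen only for the example.

```python
def summarize(readings):
    """Hypothetical example: parse, round, and label each reading."""
    summary = []
    for raw in readings:
        try:
            value = float(raw)      # intermediate result, reused below
        except ValueError:
            continue                # skip malformed input
        rounded = round(value, 1)
        label = 'high' if rounded >= 10 else 'low'
        summary.append((rounded, label))
    return summary

print(summarize(['9.96', 'n/a', '12.5', '3']))
# [(10.0, 'high'), (12.5, 'high'), (3.0, 'low')]
```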
12. Complete worked examples
12.1. A data analysis task
def analyze_sales_data(sales_records):
    """Analyze sales data."""
    # Data cleaning: keep only the valid records
    valid_records = [
        record for record in sales_records
        if record.get('amount', 0) > 0 and record.get('product')
    ]
    # Total sales per product (a running total needs a regular loop)
    sales_by_product = {}
    for record in valid_records:
        product = record['product']
        amount = record['amount']
        sales_by_product[product] = sales_by_product.get(product, 0) + amount
    # Products with high total sales
    high_sales_products = [
        product for product, total in sales_by_product.items()
        if total > 1000
    ]
    # Build the sales report
    sales_report = {
        product: {
            'total_sales': total,
            'average_sale': total / len([r for r in valid_records if r['product'] == product]),
            'is_high_sales': total > 1000
        }
        for product, total in sales_by_product.items()
    }
    return {
        'total_records': len(valid_records),
        'high_sales_products': high_sales_products,
        'sales_report': sales_report
    }
# Sample data
sample_sales = [
    {'product': 'A', 'amount': 100},
    {'product': 'B', 'amount': 200},
    {'product': 'A', 'amount': 150},
    {'product': 'C', 'amount': 300},
    {'product': 'B', 'amount': 250},
    {'product': 'A', 'amount': 1200},  # pushes product A over the high-sales threshold
]
# Run the analysis
result = analyze_sales_data(sample_sales)
print(result)
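The report above rescans valid_records once per product to count its sales. One possible alternative, sketched here with collections.defaultdict from the standard library (a swapped-in technique, not part of the original function), groups the amounts in a single pass so that each total and average comes from the product's own list.

```python
from collections import defaultdict

sample_sales = [
    {'product': 'A', 'amount': 100},
    {'product': 'B', 'amount': 200},
    {'product': 'A', 'amount': 150},
    {'product': 'C', 'amount': 300},
    {'product': 'B', 'amount': 250},
    {'product': 'A', 'amount': 1200},
]

# Group the amounts by product in a single pass.
amounts_by_product = defaultdict(list)
for record in sample_sales:
    amounts_by_product[record['product']].append(record['amount'])

# Totals and averages then come from each product's own list.
sales_report = {
    product: {
        'total_sales': sum(amounts),
        'average_sale': sum(amounts) / len(amounts),
        'is_high_sales': sum(amounts) > 1000,
    }
    for product, amounts in amounts_by_product.items()
}
print(sales_report['B'])
# {'total_sales': 450, 'average_sale': 225.0, 'is_high_sales': False}
```

For a handful of records the difference is negligible; the grouping step starts to matter when there are many records and many distinct products.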
12.2. A text processing task
def process_text_data(texts):
    """Process text data."""
    # 1. Clean the texts
    cleaned_texts = [
        text.strip().lower() for text in texts if text.strip()
    ]
    # 2. Extract keywords (words longer than three characters)
    keywords = [
        word for text in cleaned_texts
        for word in text.split()
        if len(word) > 3
    ]
    # 3. Count word frequencies
    word_freq = {
        word: keywords.count(word)
        for word in set(keywords)
    }
    # 4. Find the words that occur more than once
    high_freq_words = [
        word for word, freq in word_freq.items()
        if freq > 1
    ]
    return {
        'cleaned_texts': cleaned_texts,
        'keywords': keywords,
        'word_frequency': word_freq,
        'high_frequency_words': high_freq_words
    }
# Usage example
texts = [
    "Python is great for data analysis",
    "Data analysis with Python is powerful",
    "Python programming is fun and useful"
]
result = process_text_data(texts)
print("Cleaned texts:", result['cleaned_texts'])
print("Keywords:", result['keywords'])
print("Word frequencies:", result['word_frequency'])
print("High-frequency words:", result['high_frequency_words'])
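Counting with keywords.count(word) scans the whole keyword list once per distinct word. A possible alternative, sketched below, is collections.Counter from the standard library, which tallies every word in one pass. The keyword rule matches the function above; the use of Counter is a substitution, not part of the original example.

```python
from collections import Counter

texts = [
    "Python is great for data analysis",
    "Data analysis with Python is powerful",
    "Python programming is fun and useful",
]

# Same keyword rule as above: lowercase words longer than three characters.
keywords = [
    word
    for text in texts
    for word in text.strip().lower().split()
    if len(word) > 3
]

# Counter tallies every word in one pass over the list.
word_freq = Counter(keywords)
high_freq_words = [word for word, freq in word_freq.items() if freq > 1]

print(word_freq['python'])  # 3
print(high_freq_words)      # ['python', 'data', 'analysis']
```

Because Counter is a dict subclass, it keeps words in first-seen order, so the output order here is deterministic, unlike the version built from set(keywords).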
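13. Summary
Comprehensions are a powerful data-processing tool in Python. Used sensibly, they help to:
- Improve efficiency: comprehensions are usually faster than the equivalent loop
- Simplify logic: express an operation in less code
- Improve readability: clearer structure and more explicit intent
- Reduce errors: avoid common mistakes in hand-written loops
13.1. Learning tips
- Start simple: master basic list comprehensions first
- Go deeper step by step: move on to dictionary and set comprehensions
- Practice: use comprehensions in real projects
- Mind readability: avoid overly complex comprehensions
- Consider performance: know the performance characteristics of each form
Mastering comprehensions is an important skill for a Python programmer; they make code more elegant and more efficient.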
