Source Code Analysis: Why MyBatis foreach Causes Performance Problems

SQL · MyBatis · Source Code Analysis · Published 2019-03-16 01:08:50


Background

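Recently I was building something like a consolidated report that has to query all records (the number of records in the database is capped), roughly 10,000 rows. The report needs data from three tables, so the database is queried three times with those 10,000 IDs. One of the queries is SQL I wrote myself; the other two go through interfaces provided by other people. At first I did not think much about it and simply queried with in, using the MyBatis foreach statement. The project uses jsonrpc to request data, and during testing the request kept failing to return anything. The log showed a jsonrpc timeout exception, and reading further showed that the request was blocked in the three SQL queries above.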

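From an earlier reading of the MyBatis source code I knew that foreach can have performance problems, so I changed the SQL: I built the SQL string in code and read it in MyBatis directly with #. After replacing the class and testing again, the data came back right away.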

Premise

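Whether in is a good idea, how to optimize it, and whether to replace it with exists or inner join are all out of scope here. The only scenario considered is an in clause written with the MyBatis foreach statement that now needs optimizing. Optimizing foreach is simple: build the part that follows in inside the application code, then use it in the mapper file as a string via #{xxx} or ${xxx}.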

Test

Before analyzing the foreach source code, let's construct some data to see how large the difference is.

Table definition:

CREATE TABLE person
(
    id   int(11) PRIMARY KEY NOT NULL,
    name varchar(50),
    age  int(11),
    job  varchar(50)
);

Insert 10,000 rows:
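The insert statements themselves are not shown in the original post. As a minimal sketch of one way to generate them, the class below builds a single multi-row INSERT. The class name PersonSeedSql and the sample values for name, age and job are hypothetical, chosen only for illustration.

```java
import java.util.StringJoiner;

/** Builds one multi-row INSERT for the person table; all values are made up for illustration. */
final class PersonSeedSql {

    static String buildInsert(int fromId, int toId) {
        StringJoiner rows = new StringJoiner(",\n",
                "INSERT INTO person (id, name, age, job) VALUES\n", ";");
        for (int id = fromId; id <= toId; id++) {
            // Hypothetical sample values: name_<id>, an age between 20 and 59, job_<id % 10>
            rows.add("(" + id + ", 'name_" + id + "', " + (20 + id % 40) + ", 'job_" + (id % 10) + "')");
        }
        return rows.toString();
    }

    public static void main(String[] args) {
        // Prints the statement for IDs 1..10000; run the output in a database client.
        System.out.println(buildInsert(1, 10000));
    }
}
```

Any other way of loading 10,000 rows with IDs 1 to 10000 works equally well for the tests that follow.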

POJO class:

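import java.io.Serializable;

import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;
import lombok.ToString;

@Getter
@Setter
@ToString
@NoArgsConstructor
@AllArgsConstructor
public class Person implements Serializable {
    private int id;
    private String name;
    private String job;
    private int age;
}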

Method 1

The original approach, using the foreach statement:

1. Define the method in the DAO:

List<Person> queryPersonByIds(@Param("ids") List<Integer> ids);

2. SQL in the mapper file:

<select id="queryPersonByIds" parameterType="list" resultMap="queryPersonMap">
    select * from person where 1=1
    <if test="ids != null and ids.size() > 0">
        and id in
        <foreach collection="ids" item="item" index="index" separator="," open="(" close=")">
            #{item}
        </foreach>
    </if>
</select>

3. Run the test method:

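import java.util.ArrayList;
import java.util.List;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "classpath:spring-mybatis.xml" })
public class MainTest {

    @Autowired
    private IPersonService personService;

    @Test
    public void test() {
        // Build 10,000 IDs
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 10000; i++) {
            ids.add(i);
        }
        long start = System.currentTimeMillis();

        // Run the query three times
        personService.queryPersonByIds(ids);
        personService.queryPersonByIds(ids);
        personService.queryPersonByIds(ids);

        long end = System.currentTimeMillis();
        System.out.println(String.format("Elapsed: %d ms", end - start));
    }
}

Result: Elapsed: 2853 ms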

With foreach, the three queries take about 3 s in total.

Method 2

Build the SQL fragment in code and read it in the mapper file with ${xxx}:

1. Add a method to the DAO:

List<Person> queryPersonByIds2(@Param("ids") String ids);

2. SQL in the mapper file:

<select id="queryPersonByIds2" parameterType="String" resultMap="queryPersonMap">
    select * from person where 1=1
    <if test="ids != null and ids != ''">
        and id in ${ids}
    </if>
</select>

3. Run the test method:

@Test
public void test_2() {
    // Build the SQL fragment
    StringBuilder sb = new StringBuilder();
    sb.append("(");
    for (int i = 1; i <= 10000; i++) {
        sb.append(i).append(",");
    }
    // Drop the trailing comma
    sb.deleteCharAt(sb.length() - 1);
    sb.append(")");
    // The final fragment is (1,2,3,4,5...)

    long start2 = System.currentTimeMillis();

    // Run the query three times
    personService.queryPersonByIds2(sb.toString());
    personService.queryPersonByIds2(sb.toString());
    personService.queryPersonByIds2(sb.toString());

    long end2 = System.currentTimeMillis();
    System.out.println(String.format("Elapsed: %d ms", end2 - start2));
}

Result: Elapsed: 360 ms

With the fragment built in code and read with ${xxx}, executing the same SQL takes about 360 ms.
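The trailing-comma bookkeeping in the test can be avoided with Collectors.joining. The following is a minimal sketch rather than the code used in the measurement; InClause and toInClause are hypothetical names. Because ${xxx} pastes the string into the SQL text without escaping, the helper deliberately accepts only Integer values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that renders a list of integer IDs as an IN-list literal. */
final class InClause {

    /** Renders [1, 2, 3] as "(1,2,3)" for use with ${ids}. */
    static String toInClause(List<Integer> ids) {
        if (ids == null || ids.isEmpty()) {
            throw new IllegalArgumentException("ids must not be empty");
        }
        return ids.stream()
                .map(String::valueOf) // Integer to decimal text; no quoting is needed for numbers
                .collect(Collectors.joining(",", "(", ")"));
    }

    public static void main(String[] args) {
        System.out.println(toInClause(Arrays.asList(1, 2, 3, 4, 5))); // prints (1,2,3,4,5)
    }
}
```

Restricting the input type to Integer is the design choice that matters here: a String-typed list passed through ${xxx} would open the query to SQL injection.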

Method 3

Build the SQL fragment in code and read it in the mapper file with #{xxx}:

1. Add a method to the DAO:

List<Person> queryPersonByIds3(@Param("ids") String ids);

2. SQL in the mapper file:

<select id="queryPersonByIds3" parameterType="String" resultMap="queryPersonMap">
    select * from person where 1=1
    <if test="ids != null and ids != ''">
        and id in (#{ids})
    </if>
</select>

3. Run the test method:

@Test
public void test_3() {
    // Build the SQL fragment
    StringBuilder sb2 = new StringBuilder();
    for (int i = 1; i <= 10000; i++) {
        sb2.append(i).append(",");
    }
    // Drop the trailing comma
    sb2.deleteCharAt(sb2.length() - 1);
    // The final string is 1,2,3,4,5....

    long start3 = System.currentTimeMillis();

    // Run the query three times
    personService.queryPersonByIds3(sb2.toString());
    personService.queryPersonByIds3(sb2.toString());
    personService.queryPersonByIds3(sb2.toString());

    long end3 = System.currentTimeMillis();
    System.out.println(String.format("Elapsed: %d ms", end3 - start3));
}

Result: Elapsed: 30 ms

With the string built in code and read with #{xxx}, the three calls take about 30 ms.

Summary

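The three methods show that the elapsed time varies a lot with the approach. The fastest is building the string in code and reading it with #{xxx}; the slowest is foreach. Why foreach is so much slower is covered in the source code analysis below. The two string-building approaches also differ from each other, by roughly 10x, depending on whether #{xxx} or ${xxx} is used.

MyBatis handles # and $ differently. Anyone who has used JDBC knows there are two ways to execute SQL: a Statement object and a PreparedStatement object. A PreparedStatement represents precompiled SQL: the parameters are marked with ? placeholders and assigned afterwards with setXXX. With a Statement, every SQL command is parsed and compiled each time it runs, so PreparedStatement is generally more efficient. In MyBatis, #{xxx} becomes a ? placeholder whose value is bound through setXXX, while ${xxx} is substituted into the SQL text as a literal string. Both forms still run through a PreparedStatement by default (statementType defaults to PREPARED); what changes with ${xxx} is that the value becomes part of the SQL text instead of a bound parameter, so the text differs for every distinct value and unvalidated input can lead to SQL injection.

One caveat about the 30 ms figure: in Method 3 the whole string 1,2,3,... is bound as a single parameter, so the statement sent to the database is id in (?) with one string value, not a list of 10,000 IDs. MySQL typically coerces such a string to its leading number, so Method 3 most likely matches far fewer rows than Methods 1 and 2, and its timing is not directly comparable with theirs.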
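The difference between the two placeholders can be illustrated with a simplified simulation. This is not MyBatis code: PlaceholderDemo and its two methods are hypothetical, and the real implementation handles many more cases (nested properties, jdbcType attributes, type handlers).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified, hypothetical model of how ${} and #{} placeholders end up in the SQL text. */
final class PlaceholderDemo {

    private static final Pattern DOLLAR = Pattern.compile("\\$\\{(\\w+)\\}");
    private static final Pattern HASH = Pattern.compile("#\\{(\\w+)\\}");

    /** ${name}: the value is pasted into the SQL text itself. */
    static String substituteDollar(String sql, Map<String, ?> params) {
        Matcher m = DOLLAR.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = String.valueOf(params.get(m.group(1)));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** #{name}: replaced by ?, and the name is recorded for a later setXXX call. */
    static String replaceHash(String sql, List<String> boundNames) {
        Matcher m = HASH.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            boundNames.add(m.group(1));
            m.appendReplacement(out, "?");
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = Collections.singletonMap("ids", "(1,2,3)");
        System.out.println(substituteDollar("id in ${ids}", params)); // prints id in (1,2,3)

        List<String> names = new ArrayList<>();
        System.out.println(replaceHash("id in (#{ids})", names));     // prints id in (?)
        System.out.println(names);                                     // prints [ids]
    }
}
```

The second call shows why a comma-separated string passed through #{ids} reaches the database as one bound value: there is exactly one ? in the statement.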

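PS: only three approaches are covered above. Presumably nobody will ask about building the string as (1,2,3,4,5) and then reading it in the mapper SQL with #{xxx}.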

foreach source code analysis

Let's look at how foreach is parsed and what the resulting SQL looks like:

In MyBatis, foreach is one of the dynamic tags, and one of the more sophisticated ones. Each dynamic tag has a corresponding class that parses it, and foreach is handled mainly by ForEachSqlNode.

ForEachSqlNode parses the <foreach> node. First, a reminder of how <foreach> is used:

<select id="queryPersonByIds" parameterType="list" resultMap="queryPersonMap">
    select * from person where 1=1
    <if test="ids != null and ids.size() > 0">
        and id in
        <foreach collection="ids" item="item" index="index" separator="," open="(" close=")">
            #{item}
        </foreach>
    </if>
</select>

The SQL finally executed by the database is select * from person where 1=1 and id in (1,2,3,4,5)

First, its two inner classes:

PrefixedContext

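This class handles the prefix emitted before a fragment. In ForEachSqlNode that prefix is the separator (for example ","), which is written once before the first non-blank piece of each item.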

private class PrefixedContext extends DynamicContext {
    private DynamicContext delegate;
    // The prefix to emit
    private String prefix;
    // Whether the prefix has already been emitted
    private boolean prefixApplied;
    // .......

    @Override
    public void appendSql(String sql) {
        // If the prefix has not been emitted yet and the fragment is not blank, emit the prefix
        if (!prefixApplied && sql != null && sql.trim().length() > 0) {
            delegate.appendSql(prefix);
            prefixApplied = true;
        }
        // Append the SQL fragment
        delegate.appendSql(sql);
    }
}
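The effect of PrefixedContext can be reproduced without MyBatis. In the simplified sketch below a StringBuilder stands in for the delegate DynamicContext, and the join method mirrors how the foreach loop picks the prefix for each item. PrefixedAppender is a hypothetical name, and unlike the real DynamicContext this model adds no whitespace between fragments.

```java
/** Hypothetical stand-in for PrefixedContext, writing into a StringBuilder. */
final class PrefixedAppender {
    private final StringBuilder delegate;
    private final String prefix;
    private boolean prefixApplied;

    PrefixedAppender(StringBuilder delegate, String prefix) {
        this.delegate = delegate;
        this.prefix = prefix;
    }

    boolean isPrefixApplied() {
        return prefixApplied;
    }

    void appendSql(String sql) {
        // Emit the prefix once, right before the first non-blank fragment
        if (!prefixApplied && sql != null && sql.trim().length() > 0) {
            delegate.append(prefix);
            prefixApplied = true;
        }
        delegate.append(sql);
    }

    /** Joins fragments the way the foreach loop does: no prefix until something non-blank was written. */
    static String join(String separator, String... fragments) {
        StringBuilder out = new StringBuilder();
        boolean first = true;
        for (String fragment : fragments) {
            PrefixedAppender ctx = new PrefixedAppender(out, first ? "" : separator);
            ctx.appendSql(fragment);
            if (first) {
                first = !ctx.isPrefixApplied();
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(",", "a", "b", "c")); // prints a,b,c
    }
}
```

A blank fragment never triggers the prefix, which is why an item whose body renders to nothing does not leave a stray separator behind.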

FilteredDynamicContext

FilteredDynamicContext handles the #{} placeholders. It does not bind any parameters; it only rewrites #{item} into a placeholder such as #{__frch_item_1}.

private static class FilteredDynamicContext extends DynamicContext {
    private DynamicContext delegate;
    // Position of the current element within the collection
    private int index;
    // Name of the index variable
    private String itemIndex;
    // Name of the item variable
    private String item;
    //.............
    // Rewrites #{item}
    @Override
    public void appendSql(String sql) {
        GenericTokenParser parser = new GenericTokenParser("#{", "}", new TokenHandler() {
            @Override
            public String handleToken(String content) {
                // Turn #{item} into something like #{__frch_item_1}
                String newContent = content.replaceFirst("^\\s*" + item + "(?![^.,:\\s])", itemizeItem(item, index));
                // Turn #{itemIndex} into something like #{__frch_itemIndex_1}
                if (itemIndex != null && newContent.equals(content)) {
                    newContent = content.replaceFirst("^\\s*" + itemIndex + "(?![^.,:\\s])", itemizeItem(itemIndex, index));
                }
                // Return #{__frch_item_1} or #{__frch_itemIndex_1}
                return new StringBuilder("#{").append(newContent).append("}").toString();
            }
        });
        // Append the rewritten SQL
        delegate.appendSql(parser.parse(sql));
    }

    private static String itemizeItem(String item, int i) {
        return new StringBuilder("__frch_").append(item).append("_").append(i).toString();
    }
}
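The regular expression in handleToken is easier to follow with a few concrete inputs. The sketch below copies the two replaceFirst calls into a standalone method; ItemRewriteDemo and rewrite are hypothetical names, and the token parsing around them is left out.

```java
/** Standalone copy of the rewrite rule from handleToken, for experimenting with inputs. */
final class ItemRewriteDemo {

    static String itemizeItem(String item, int i) {
        return "__frch_" + item + "_" + i;
    }

    /** Rewrites the content of one #{...} token for loop iteration i. */
    static String rewrite(String content, String item, String itemIndex, int i) {
        // Match the item name only when it is followed by end of input, '.', ',', ':' or whitespace
        String newContent = content.replaceFirst("^\\s*" + item + "(?![^.,:\\s])", itemizeItem(item, i));
        if (itemIndex != null && newContent.equals(content)) {
            newContent = content.replaceFirst("^\\s*" + itemIndex + "(?![^.,:\\s])", itemizeItem(itemIndex, i));
        }
        return "#{" + newContent + "}";
    }

    public static void main(String[] args) {
        System.out.println(rewrite("item", "item", "index", 1));      // prints #{__frch_item_1}
        System.out.println(rewrite("item.name", "item", "index", 1)); // prints #{__frch_item_1.name}
        System.out.println(rewrite("index", "item", "index", 1));     // prints #{__frch_index_1}
        System.out.println(rewrite("itemCount", "item", "index", 1)); // prints #{itemCount}
    }
}
```

The negative lookahead is what keeps a different parameter such as itemCount from being mistaken for the loop variable item.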

ForEachSqlNode

With the two inner classes of ForEachSqlNode covered, here is its implementation:

public class ForEachSqlNode implements SqlNode {
    public static final String ITEM_PREFIX = "__frch_";
    // Evaluates the collection expression
    private ExpressionEvaluator evaluator;
    // Expression for the collection to iterate over
    private String collectionExpression;
    // Child node
    private SqlNode contents;
    // Opening string
    private String open;
    // Closing string
    private String close;
    // Separator
    private String separator;
    // Name of the current element; if the collection is a map, index is the key and item is the value
    private String item;
    // Name of the current iteration index
    private String index;
    private Configuration configuration;

    // ...............

    @Override
    public boolean apply(DynamicContext context) {
        // Get the parameters
        Map<String, Object> bindings = context.getBindings();
        final Iterable<?> iterable = evaluator.evaluateIterable(collectionExpression, bindings);
        if (!iterable.iterator().hasNext()) {
            return true;
        }
        boolean first = true;
        // Append the opening string
        applyOpen(context);
        int i = 0;
        for (Object o : iterable) {
            DynamicContext oldContext = context;
            if (first) {
                // First item of the collection: the prefix is an empty string
                context = new PrefixedContext(context, "");
            } else if (separator != null) {
                // A separator is configured: use it as the prefix
                context = new PrefixedContext(context, separator);
            } else {
                // No separator configured: default to an empty string
                context = new PrefixedContext(context, "");
            }
            int uniqueNumber = context.getUniqueNumber();
            if (o instanceof Map.Entry) {
                // For a map, store the entry's key and value in the bindings
                Map.Entry<Object, Object> mapEntry = (Map.Entry<Object, Object>) o;
                // This is where index becomes the key and item becomes the value for a map
                applyIndex(context, mapEntry.getKey(), uniqueNumber);
                applyItem(context, mapEntry.getValue(), uniqueNumber);
            } else {
                // Not a map: store the element's position and the element itself in the bindings
                applyIndex(context, i, uniqueNumber);
                applyItem(context, o, uniqueNumber);
            }
            // Apply the child nodes through a FilteredDynamicContext, which rewrites #{item}
            contents.apply(new FilteredDynamicContext(configuration, context, index, item, uniqueNumber));
            if (first) {
                first = !((PrefixedContext) context).isPrefixApplied();
            }
            context = oldContext;
            i++;
        }
        // Append the closing string
        applyClose(context);
        return true;
    }

    private void applyIndex(DynamicContext context, Object o, int i) {
        if (index != null) {
            context.bind(index, o); // key is the index name, value is the position (or map key)
            context.bind(itemizeItem(index, i), o); // add the prefix and suffix to form a new, unique key
        }
    }

    private void applyItem(DynamicContext context, Object o, int i) {
        if (item != null) {
            context.bind(item, o); // key is the item name, value is the element
            context.bind(itemizeItem(item, i), o); // same value under the unique key
        }
    }
}

So for this example:

<select id="queryPersonByIds" parameterType="list" resultMap="queryPersonMap">
    select * from person where 1=1
    <if test="ids != null and ids.size() > 0">
        and id in
        <foreach collection="ids" item="item" index="index" separator="," open="(" close=")">
            #{item}
        </foreach>
    </if>
</select>

the SQL after parsing is:

select * from person where 1=1 and id in (#{__frch_item_0}, #{__frch_item_1}, #{__frch_item_2}, #{__frch_item_3}, #{__frch_item_4})

The values are then assigned through the setXXX methods of PreparedStatement.
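The expansion can be mimicked in a few lines to see how the work grows with the collection. This is a rough model, not the MyBatis implementation: ForEachExpansionDemo and expand are hypothetical names, the body of the loop is fixed to a single #{item}, and the bindings are kept in a plain map.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Rough, hypothetical model of what a foreach over #{item} produces. */
final class ForEachExpansionDemo {

    /** Returns e.g. "(#{__frch_item_0},#{__frch_item_1})" and fills one binding per element. */
    static String expand(List<?> items, String item, String open, String close,
                         String separator, Map<String, Object> bindings) {
        if (items.isEmpty()) {
            return ""; // an empty collection emits nothing, not even open/close
        }
        StringBuilder sql = new StringBuilder(open);
        int i = 0;
        for (Object o : items) {
            if (i > 0) {
                sql.append(separator);
            }
            String key = "__frch_" + item + "_" + i; // unique name per iteration
            bindings.put(key, o);                    // looked up later when parameters are set
            sql.append("#{").append(key).append("}");
            i++;
        }
        return sql.append(close).toString();
    }

    public static void main(String[] args) {
        Map<String, Object> bindings = new LinkedHashMap<>();
        System.out.println(expand(Arrays.asList(7, 8, 9), "item", "(", ")", ",", bindings));
        System.out.println(bindings.keySet());
    }
}
```

Every element yields one generated placeholder and one binding entry, so 10,000 IDs mean 10,000 placeholders that still have to be parsed into ? markers and parameter mappings before the statement can run.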

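So at this point we know that when MyBatis parses foreach, it still ends up with the # form. Why is it slow, then? Because every placeholder such as #{__frch_item_0} has to be generated and parsed in a loop, so the larger the collection, the slower the parsing. Since the cost lies in parsing the placeholders, why not build the string yourself? That is, build it in code and stop using foreach.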

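In short, when MyBatis parses foreach it produces the # form underneath, not the $ form. Knowing this, if the collection passed to foreach is very large, you can build the SQL fragment in code and read it with (#{xxx}), rather than building (1,2,3,4,5) and reading it with ${xxx}. Keep in mind that (#{xxx}) binds the whole string as a single parameter, so check that the query still returns the rows you expect.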
