In my previous articles, I’ve mainly focused on in-depth, lengthy content. Hereby I am starting a new series of more digestible articles – delicious bites of marketing & tech delicacies 😋.
Today, this one is about Google Sheet’s Query Function. This function is nothing short but amazing. For me, it’s the nail in the coffin for MS Excel 😃 But decide for yourself!
- simplifies complex functions and hence improves adaptability & readability
- it’s using the same principles as SQL – so no need to re-think!
- it’s often faster than using complex INDEX/MATCH or VLOOKUP
- You can’t easily ‘sort’ selections which use query(), the sort needs to be part of the query() itself
- If you have never used SQL before, it might appear more complex first. But don’t give up here!
Alright, I won’t go into a more detail, but if you’re interested on how to use Query() – here’s a great article from CodingIsForLosers – ‘A weapon of mass laziness‘.
Today, I’d like to present a nifty solution to a problem I recently encountered – querying column names!
The problem: You can’t Query the Column Header by Name in Google Spreadsheets
What do I mean with that?
Let’s use Sample table with the following information:
In a normal function, Google Sheets would shift the reference accordingly – but not here. It will break the function. Not fun!
The solution: Query by Column Name
1. First, we need a formula that returns the position of the column.
We can use the ADDRESS() in combination with MATCH() for that. ADDRESS() returns the cell position as a string. The structure is as follows:
ADDRESS(row, column, [absolute_relative_mode], [use_a1_notation], [sheet])
row: this can be “1”, even if your header column is not in row 1 – because it just depends on the range that you provide.
column: this is unknown, so we’ll use the MATCH() function to find the number of the column we want to reference (e.g. in our example”salary”).
[absolute_relative_mode]– optional, “1” be default. 1 is row and column absolute(e.g. $A$1), 2 is row absolute and column relative (e.g. A$1), 3 is row relative and column absolute (e.g. $A1), and 4 is row and column relative (e.g. A1). We are using 4 in this example to keep it fixed.
use_a1_notation– optional, “TRUE” as default.
sheet– optional, absent by default. Needs to be changed if you’re referencing a different sheet (in our case we don’t).
What do we need to set for MATCH()?
Here’s the MATCH() structure:
MATCH(search_key, range, [search_type])
search_key– The value to search for. We’re using “salary” in our example.
range– This is the range of all of the possible header you want to reference. Note, it must be one-dimensional (e.g.
A1:F1). In our case, it’s just
1$:$1. ($ to make the reference absolute).
1by default. We want to make sure the function is searching for an exact match and the range is not sorted, so we’re using
Ok, so 🥁.. here’s our function:
E1, we only want to return the column letter (
2. Remove the row number from the returned cell.
We’ll just wrap the function with a simple SUBSTITUTE() function to replace ‘1’ (with nothing):
Note: This only works if you’re header column is in the first row. There are ways to make this more flexible, e.g. by searching for the first number in the string (the row), and remove everything after that (w/ LEFT() function). Add a comment if you’re having problems with that!
3. Combine it back into the Query() function
Adding this into the query function looks a bit complicated, but there’s a standard format for this:
- The referenced column must be wrapped in single quotes (”) for strings (if you’re referencing numbers, you don’t need that)
- Close and re-open the query with double quotes (“”)
- Use ampersands (&) to add the referenced cell to the query string
=query(A1:G9,"select avg("&SUBSTITUTE(ADDRESS(1,MATCH(D10,$1:$1,0),4),"1","")&") ")
Nice… so what’s so cool about this? Well, I mentioned that it makes the whole spreadsheet more versatile and less error prone. However, there’s more.
Bonus: How to use query() with drop down fields
Wouldn’t it be cool to have drop downs to get different values from our data range in case we need them?
Let’s assume we’re interested not only in the average salary, but also the average age. How do we get that?
We just need to change the value of our input cell (
D10 in my example). And we can use a dropdown by going to Data>Data validation and use the “List of Items” functionality.
We can even use this to change the aggregate function. maybe we’re interested in the MAX, MIN and SUM as well? Well, here you go. We’re just using another cell to provide these as drop downs, and then reference in the query:
=query(A1:G9,"select "&D11&"("&SUBSTITUTE(ADDRESS(1,MATCH(D10,$1:$1,0),4),"1","")&") ")
Here’s my sample spreadsheet if you’d like to play around or make a copy for yourself: Click to see sample spreadsheet
All working? Congrats! Now go and make some awesome spreadsheets 🤠 and if it was useful, please leave a comment, tweet me at @joradig or drop me a line on LinkedIn. It really keeps me going and motivated.