Applying a function to columns of two matrices producing their combinations
I made a short snippet for applying a function f that takes two arguments (vectors) to the columns of two matrices, producing all possible combinations of those columns.
function V = cross_apply(f, V1, V2)
% Apply function f to columns of V1 and V2 producing
% a matrix where V(i, j) = f(V1(:,i), V2(:,j))
s1 = size(V1, 2);
s2 = size(V2, 2);
V = zeros(s1, s2);
for i = 1:s1
    for j = 1:s2
        V(i, j) = f(V1(:, i), V2(:, j));
    end
end
end
An example use of this function is approximating a function with a polynomial using a custom inner product (not the traditional dot product), here called ip and defined as a kind of integral.
close all
f = @(x) log(x);
P = @(x) x.^(0:3);
ip = @(x, y) trapz(x.*y) / (size(x, 1)-1);
t = linspace(1, 2, 10)';
V = P(t);
% Inner product matrices
A = cross_apply(ip, V, V);
b = cross_apply(ip, V, f(t));
% Coefficients for the polynomial
rr = rref([A, b])
coef = rr(:, end);
% Plot ln(t) and polynomial fit
t_ = linspace(1, 3, 1000)';
figure()
plot(t_, P(t_)*coef, 'r--')
grid on, hold on
plot(t_, f(t_), 'b:')
legend('fit', 'f(t)', 'Location', 'best')
Without this function, one would have to write out a whole bunch of individual inner products (ip in the code) to fill the matrices A and b. I didn't find a built-in function that does this, so I'm after two things:
- Is this already implemented as a MATLAB built-in or in some toolbox?
- Is there some improvement to be made to the function?
matrix matlab
asked Oct 3 '18 at 8:33 by Felix (edited by Jamal♦)
1 Answer
Your code is clear and concise; there is not much to complain about, except for one small thing:
t = linspace(1, 2, 10)';
In MATLAB, ' is the complex conjugate transpose, whereas .' is the non-conjugate transpose, the simple re-ordering of dimensions. The array it is applied to here is real-valued, so there is no difference, but it is good practice to use .' whenever the intention is simply to change the orientation of a vector. Bugs do appear at times because of improper use of the ' operator.
As I've said before, loops are no longer slow in MATLAB, but often it's still possible to remove them ("vectorize the code") for a time gain. If the vectorization doesn't involve creating intermediate copies of the data or complex indexing, it is still profitable.
In this particular case, vectorization means calling a function once instead of 4x4 = 16 times, and function calls still have a significant overhead in MATLAB, so cutting down on function calls will be beneficial.
Producing all possible combinations of two values can be accomplished with the bsxfun function in MATLAB. But in this case we want all combinations of two sets of columns, and there is no function for that. However, you could use it to generate the products of all column pairs (the argument to trapz in your function ip). This requires a large intermediate matrix to be generated, but in this case (16 column pairs of length 10) that is a very mild amount of data, and well worth the price if it means not calling a function 16 times.
Note that bsxfun is, since R2016b, no longer necessary: everything you could previously do with bsxfun can now be done without it. For example, given x = 1:10, you can do:
y = x .* x.';
to produce a 10x10 matrix with the products of all combinations of elements of x. This implicit singleton expansion happens automatically when the two arguments to any operator have compatible dimensions. That means that their sizes are identical along each dimension, or one of them has size 1 along a dimension (a singleton dimension). That singleton dimension is virtually repeated to match the size of the other matrix.
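On releases older than R2016b, the same outer product can be written with an explicit bsxfun call; a small sketch (variable names are just illustrative):
x = 1:10;
y = bsxfun(@times, x, x.');   % same 10x10 matrix of pairwise products as x .* x.'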
Your function ip is written such that, given two input matrices x and y of compatible dimensions, it will compute your custom inner product along the first dimension, as long as that dimension has a size larger than 1. To make the function safer to use, you can explicitly tell trapz along which dimension to operate:
ip = @(x, y) trapz(x.*y, 1) / (size(x, 1)-1);
Now ip(x,y) will compute many custom inner products at once if x and y are matrices. However, ip(V, V) will compute the inner product of each column in V with itself, not all combinations of columns. For that, we need to do a little bit of reshaping of the inputs:
function V = cross_apply2(f, V1, V2)
% Vectorized variant: relies on f accepting matrix inputs and on
% implicit singleton expansion (R2016b or newer).
s1 = size(V1, 2);
s2 = size(V2, 2);
V2 = reshape(V2, [], 1, s2);  % N x s2  ->  N x 1 x s2 (no data copy)
V = f(V1, V2);                % for a reducing f like ip: 1 x s1 x s2
V = reshape(V, s1, s2);
end
What this code does is convert the second input matrix of size NxM into an array of size Nx1xM (this can be done without copying the data). Now the function f (really ip, the way you call it) will use implicit singleton expansion to compute the custom inner product of each combination of columns of V1 and V2.
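To make the sizes explicit, here is a minimal sketch of the intermediate shapes inside cross_apply2 (assuming the 10x4 matrix V from the question and R2016b or newer):
N = 10; s1 = 4; s2 = 4;
V1 = rand(N, s1);                        % N x s1
V2 = reshape(rand(N, s2), [], 1, s2);    % N x 1 x s2
W = V1 .* V2;                            % expands to N x s1 x s2
R = trapz(W, 1) / (N - 1);               % 1 x s1 x s2, one inner product per column pair
size(reshape(R, s1, s2))                 % s1 x s2, the same shape cross_apply returns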
You can verify that cross_apply and cross_apply2 yield exactly the same result:
A = cross_apply(ip, V, V);
b = cross_apply(ip, V, f(t));
A2 = cross_apply2(ip, V, V);
assert(isequal(A,A2))
b2 = cross_apply2(ip, V, f(t));
assert(isequal(b,b2))
And we can also see what the time savings are for calling the function ip once instead of 16 times:
timeit(@()cross_apply(ip, V, V))
timeit(@()cross_apply2(ip, V, V))
On my computer this is 1.5016e-04 vs 2.9650e-05, a 5x speedup.
But cross_apply2 is not as general as cross_apply, as it requires the function being applied to be vectorized.
trapz is a funny function. In your case, where it is called without an x input argument, it sets x to 1. The trapezoid rule then is simply the sum of the elements of the input vector, minus half the first and half the last element. I don't know what your application is, but have you considered replacing this with a simple sum? Computing the true dot product is a lot faster.
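As a sketch of that suggestion (an assumed variant, not the original ip), a sum-based inner product could look like the following; it keeps the same normalisation but drops the endpoint correction, so the result differs slightly from the trapz version:
% Hypothetical simple-sum inner product (rectangle rule instead of trapezoid rule)
ip_sum = @(x, y) sum(x.*y, 1) / (size(x, 1) - 1);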
You can do edit trapz to see how trapz is implemented. For the way you call it, it does:
z = sum((y(1:end-1,:) + y(2:end,:)), 1)/2;
This is the same, as I said before, as:
z = sum(y(:,:)) - y(1,:)/2 - y(end,:)/2;
But this second form is twice as fast for large arrays: the first form copies the data in y twice (minus one row), adds those copies and sums them up, whereas the second form simply adds up all elements, then subtracts half of the first row and half of the last row. For a random large array y:
y = rand(1000,100,100);
z1 = sum((y(1:end-1,:) + y(2:end,:)), 1)/2;
z2 = sum(y(:,:)) - y(1,:)/2 - y(end,:)/2;
max(abs(z1(:)-z2(:)))
timeit(@()sum((y(1:end-1,:) + y(2:end,:)), 1)/2)
timeit(@()sum(y(:,:)) - y(1,:)/2 - y(end,:)/2)
This shows that the largest difference between the two results is rounding errors (4.8317e-13), and the timings are 0.0778 vs 0.0341.
answered Oct 6 '18 at 7:07 by Cris Luengo
Thank you very much! About using trapz: this was a simple assignment with an integral as the inner product, but a constant set of sampling points equidistant from one another, so I thought the minimisation doesn't change. Yeah, I hadn't thought about the function call overhead! And I do think assuming vectorised functions is fine for this case, and even a reasonable constraint in general. Another version of cross_apply could be made if the vanilla one is needed.
– Felix, Oct 6 '18 at 7:33